CN108510083B - Neural network model compression method and device - Google Patents


Info

Publication number
CN108510083B
CN108510083B (application CN201810274146.3A)
Authority
CN
China
Prior art keywords
neural network
network model
compressed
feature vector
trained
Prior art date
Legal status
Active
Application number
CN201810274146.3A
Other languages
Chinese (zh)
Other versions
CN108510083A (en)
Inventor
孙源良
王亚松
刘萌
樊雨茂
Current Assignee
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd filed Critical Guoxin Youe Data Co Ltd
Priority to CN201810274146.3A
Publication of CN108510083A
Application granted
Publication of CN108510083B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention provides a neural network model compression method and device. The method comprises the following steps: inputting training data into a neural network model to be compressed and into a target neural network model; and training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, so as to obtain a compressed neural network model, wherein the target neural network model has fewer parameters than the neural network model to be compressed. According to the embodiments of the invention, the training of the target neural network model is guided by the feature vectors and classification results produced on the training data by the neural network model to be compressed, so that the compressed neural network model finally obtained yields the same classification results on the same training data as the neural network model to be compressed. Precision loss during model compression is thereby avoided: the size of the model can be compressed while its precision is preserved, meeting the dual requirements on precision and model size.

Description

Neural network model compression method and device
Technical Field
The invention relates to the technical field of machine learning, in particular to a neural network model compression method and device.
Background
With the rapid development of neural networks in fields such as images, speech and text, a series of intelligent products have been brought to market. To let a neural network better learn the characteristics of training data and thereby improve model performance, the number of parameters used to represent the neural network model has grown rapidly and the number of network layers keeps increasing, so deep neural network models suffer from very large parameter counts and heavy computation during both training and application. As a result, most neural-network-based products depend on server-side computing power and on a good operating environment and network environment, which limits the application range of neural network models; embedded deployment, for example, cannot be realized. To enable embedded application of a neural network model, the volume of the model needs to be compressed below a certain range.
Current model compression methods generally include the following. First, pruning: after a large model is trained, parameters with small weights are removed from the network model, and the model is then trained further. Second, weight sharing, which reduces the number of distinct parameters. Third, quantization: parameters of a neural network model are usually represented as 32-bit floating-point numbers, but such high precision does not really need to be retained, so quantization reduces the space occupied by each weight, for example by representing values originally expressed with 32-bit precision on the range 0 to 255, at the cost of some precision. Fourth, binarization of the neural network, in which the parameters of the network model are represented with binary numbers so as to reduce the model size.
However, the above methods perform compression directly on the model to be compressed and do so at the expense of model precision, which often fails to meet the requirements on precision.
Disclosure of Invention
In view of the above, an object of the embodiments of the present invention is to provide a method and an apparatus for compressing a neural network model, which can compress the size of the model while ensuring the accuracy of the neural network model.
In a first aspect, an embodiment of the present invention provides a neural network model compression method, where the method includes:
inputting training data into a neural network model to be compressed and a target neural network model;
training a target neural network model based on the feature vectors and classification results extracted from the training data by the neural network model to be compressed to obtain a compressed neural network model;
wherein the number of the target neural network model parameters is less than the number of the neural network model parameters to be compressed.
In a second aspect, an embodiment of the present invention further provides a neural network model compression apparatus, where the apparatus includes:
the input module is used for inputting the training data into the neural network model to be compressed and the target neural network model;
the training module is used for training a target neural network model based on the feature vectors and the classification results of the training data extracted by the neural network model to be compressed to obtain a compressed neural network model;
wherein the number of the target neural network model parameters is less than the number of the neural network model parameters to be compressed.
With the neural network model compression method and device provided by the embodiments of the present application, when a neural network model to be compressed is to be compressed, a target neural network model with fewer parameters than the neural network model to be compressed is constructed in advance; the training data are then input into both the neural network model to be compressed and the target neural network model, and the training of the target neural network model is guided by the feature vectors and classification results that the neural network model to be compressed extracts from the training data, so as to obtain the compressed neural network model. The compressed neural network model finally obtained produces the same classification results on the same training data as the neural network model to be compressed, so no precision is lost during compression; the size of the model can therefore be compressed while precision is guaranteed, meeting the dual requirements on precision and model size.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a flowchart illustrating a neural network model compression method according to an embodiment of the present application;
fig. 2 is a flowchart illustrating a specific method for training a target neural network model based on a classification result of a neural network model to be compressed on training data according to a second embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a model compression process provided in the second embodiment of the present application;
fig. 4 is a flowchart illustrating a first comparison operation provided in the third embodiment of the present application;
fig. 5 is a flowchart of a specific method for performing similarity matching on the first feature vector and the second feature vector and performing the current round of training on the target neural network according to the result of the similarity matching, which is further provided in the fourth embodiment of the present application;
fig. 6 is a flowchart illustrating a similarity determination operation according to the fourth embodiment of the present application;
fig. 7 is a flowchart illustrating another specific method for performing similarity matching on the first feature vector and the second feature vector and performing the current round of training on the target neural network according to the result of the similarity matching according to the fifth embodiment of the present application;
fig. 8 is a flowchart illustrating a similarity determination operation according to a fifth embodiment of the present application;
FIG. 9 is a flowchart illustrating a neural network model compression method according to a sixth embodiment of the present disclosure;
fig. 10 is a schematic structural diagram illustrating a neural network model compression apparatus provided in a seventh embodiment of the present application;
fig. 11 shows a schematic structural diagram of a computer device according to an eighth embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
For the understanding of the present embodiment, a detailed description will be given to a neural network model compression method disclosed in the embodiment of the present invention, which can be used for compressing the sizes of various neural network models.
Referring to fig. 1, a neural network model compression method provided in an embodiment of the present application includes:
s101: and inputting the training data into the neural network model to be compressed and the target neural network model.
When the method is specifically implemented, the neural network model to be compressed is a neural network model with a large volume, is trained by training data, and is composed of a single neural network or a combination of a plurality of neural networks. It has a larger number of parameters than the target neural network model. The parameters herein may include the number of feature extraction layers of the neural network and/or parameters involved in each feature extraction layer.
Therefore, in order to compress the neural network model to be compressed, the training data first need to be input into it so that it learns the characteristics of the training data; the trained model obtained in this way is then taken as the neural network model to be compressed.
The target neural network model is a pre-configured neural network model with fewer parameters than the neural network model to be compressed; for example, it has fewer feature extraction layers and a simpler network structure.
Here, it should be noted that, if the neural network model to be compressed is obtained by using an unsupervised training method, the training data is unlabeled; if the neural network model to be compressed is obtained by using a supervised training method, the training data are labeled; if the neural network model to be compressed is obtained by using a transfer learning training method, the training data can be labeled or unlabeled.
S102: training a target neural network model based on the feature vectors and classification results extracted from the training data by the neural network model to be compressed to obtain a compressed neural network model;
In a specific implementation, the training data are input into the neural network model to be compressed and into the target neural network model, and the classification results of the model to be compressed on the training data are used to guide the training of the target neural network model; during this training, the target model's classification results on the training data are brought as close as possible to those of the neural network model to be compressed.
In the neural network model compression method provided by the embodiment of the application, when the neural network model to be compressed is compressed, a target neural network with fewer parameters than the neural network model to be compressed is constructed in advance; the training data are then input into the neural network model to be compressed and into the target neural network model, and the target neural network model is trained under the guidance of the feature vectors and classification results extracted from the training data by the neural network model to be compressed, yielding the compressed neural network model. This process does not operate on the neural network model to be compressed itself, and the compressed neural network model finally obtained produces the same classification results on the same training data as the neural network model to be compressed, so no precision is lost during model compression; the size of the model is compressed and the dual requirements on precision and model size are met.
Specifically, the network model to be compressed generally includes a neural network to be compressed and a classifier to be compressed. The target neural network model generally includes a target neural network and a target classifier. The compressed neural network model obtained by training includes a compressed neural network and a compressed classifier.
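For illustration, the sketch below sets up the two models in this decomposed form in PyTorch. The architectures, layer widths and class count are placeholders chosen for the example and are not specified by this application; the only property the sketch relies on is that the model to be compressed has more feature extraction layers and parameters than the target model.

```python
import torch.nn as nn

class FeatureClassifierModel(nn.Module):
    """A model consisting of a feature-extraction network followed by a classifier."""
    def __init__(self, in_dim, hidden_dims, feat_dim, num_classes):
        super().__init__()
        layers, prev = [], in_dim
        for width in hidden_dims:                       # feature extraction layers
            layers += [nn.Linear(prev, width), nn.ReLU()]
            prev = width
        layers.append(nn.Linear(prev, feat_dim))
        self.feature_extractor = nn.Sequential(*layers)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        features = self.feature_extractor(x)            # feature vector
        logits = self.classifier(features)              # classification result (logits)
        return features, logits

# Model to be compressed: more feature extraction layers and parameters (sizes are illustrative only).
model_to_compress = FeatureClassifierModel(784, [1024, 1024, 512, 512], 256, 10)
# Target model: pre-built with fewer layers and fewer parameters.
target_model = FeatureClassifierModel(784, [256], 256, 10)
```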
Referring to fig. 2, a second embodiment of the present application further provides a specific method for training a target neural network model based on a classification result of a to-be-compressed neural network model on training data, including:
s201: and extracting a first feature vector for input training data by using the neural network to be compressed, and extracting a second feature vector for the input training data by using the target neural network.
S202: performing similarity matching on the first feature vector and the second feature vector, and performing the training of the target neural network according to the result of the similarity matching;
s203: inputting the first feature vector to a classifier to be compressed to obtain a first classification result;
inputting the second feature vector into a target classifier to obtain a second classification result;
s204: performing the training of the target neural network and the target classifier in the current round according to the comparison result of the first classification result and the second classification result;
s205: and performing multi-round training on the target neural network and the target classifier to obtain a compressed neural network model.
In specific implementation, referring to a schematic diagram of a model compression process shown in fig. 3, for convenience of describing the embodiment of the present application, two functional modules, a similarity matching module and a comparison module are introduced in the embodiment. The similarity matching module is used for performing similarity matching on the first feature vector and the second feature vector; the comparison module is used for comparing the first classification result with the second classification result.
The training data is input to the neural network model to be compressed and the target neural network model. After the training data are input into the neural network model to be compressed, two processes are executed, firstly, the neural network to be compressed performs feature extraction on the training data to obtain a first feature vector of the training data; and then transmitting the first feature vector to a classifier to be compressed, and classifying the training data represented by the first feature vector by the classifier to be compressed based on the first feature vector to obtain a first classification result.
Similarly, after the training data are input into the target neural network model, two processes are also executed, firstly, the target neural network performs feature extraction on the training data to obtain a second feature vector of the training data; and then transmitting the second feature vector to a target classifier, and classifying the training data represented by the second feature vector by the target classifier based on the second feature vector to obtain a second classification result.
The process of compressing the neural network model to be compressed is in fact a process of guiding the training of the target neural network model by means of the neural network model to be compressed, so that the compressed neural network model obtained by training gives classification results consistent with those of the neural network model to be compressed on the same training data. That is, when the neural network to be compressed and the compressed neural network extract features from the same training data, the resulting feature vectors should be as similar as possible; meanwhile, when the classifier to be compressed and the compressed classifier classify the training data based on these closely matched feature vectors, the classification results should be consistent. Thus, when the target neural network model is trained, both the target neural network and the target classifier are trained.
In the training process, the parameters of the target neural network are influenced by the similarity matching result of the first feature vector and the second feature vector, and the parameters of the target neural network are adjusted according to the similarity matching result. The first feature vector and the second feature vector are difficult to be consistent due to different parameters in the target neural network and the neural network to be compressed. Therefore, the target neural network needs to approach the second feature vector extracted from the training data to the first feature vector as much as possible; meanwhile, the parameters of the target neural network are also influenced by a second classification result of the training data classified by the target classifier on the second feature vector, and when the second classification result is inconsistent with the first classification result, the parameters of the target neural network are adjusted, so that the second classification result obtained by the target classifier is consistent with the first classification result.
When the first classification result is inconsistent with the second classification result, the parameters of the target classifier are also adjusted so that the second classification result becomes consistent with the first classification result.
Then, after the training data are input into the neural network model to be compressed and the target neural network model, the neural network to be compressed first extracts a first feature vector from the training data and the target neural network extracts a second feature vector from the same input training data; the first feature vector and the second feature vector of the same training data are then passed to the similarity matching module, which performs similarity matching on them, and the target neural network undergoes the current round of training according to the similarity matching result. Meanwhile, the first feature vector is input to the classifier to be compressed to obtain a first classification result, the second feature vector is input to the target classifier to obtain a second classification result, the two classification results are passed to the comparison module, which compares them, and the target neural network and the target classifier undergo the current round of training according to the comparison result.
And performing multi-round training on the target neural network and the target classifier to obtain a compressed neural network model.
It should be noted that the current round of training refers to training the target neural network model by using the same training data until the second feature vector obtained by feature extraction of the training data by the target neural network and the second classification result obtained by classification both meet preset conditions; the multi-round training refers to training a target neural network by using a plurality of training data, and each training data performs one round of training on the target neural network.
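As an illustration of one round followed by multi-round training, the sketch below combines the two guiding signals into a single loss: a similarity-matching term between the feature vectors and a consistency term between the classification results. The use of mean-squared error and cross-entropy, the thresholds and the step counts are assumptions made for this example; the application itself only requires that the second feature vector and second classification result approach those of the model to be compressed, and it describes the two signals through separate modules rather than one summed loss.

```python
import torch
import torch.nn.functional as F

def train_one_round(x, model_to_compress, target_model, optimizer,
                    sim_threshold=1e-3, max_steps=100):
    """One round: repeatedly adjust the target model on the same training data until its
    feature vector and classification result are close enough to those of the model
    to be compressed."""
    model_to_compress.eval()
    with torch.no_grad():
        first_feat, first_logits = model_to_compress(x)        # first feature vector / classification result
        first_labels = first_logits.argmax(dim=1)

    for _ in range(max_steps):
        second_feat, second_logits = target_model(x)             # second feature vector / classification result
        sim_loss = F.mse_loss(second_feat, first_feat)           # similarity-matching term (assumed form)
        cls_loss = F.cross_entropy(second_logits, first_labels)  # classification-consistency term (assumed form)
        loss = sim_loss + cls_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        consistent = bool((second_logits.argmax(dim=1) == first_labels).all())
        if consistent and sim_loss.item() < sim_threshold:
            break                                                # preset conditions met: round finished
    return loss.item()

def train_multi_round(training_batches, model_to_compress, target_model):
    """Multi-round training: each batch of training data drives one round."""
    optimizer = torch.optim.SGD(target_model.parameters(), lr=0.01)
    for x in training_batches:
        train_one_round(x, model_to_compress, target_model, optimizer)
    return target_model   # the compressed neural network model
```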
Specifically, a third embodiment of the present application further provides a specific method for performing the current round of training of the target neural network and the target classifier according to the comparison result between the first classification result and the second classification result, namely: executing the following first comparison operation until the classification loss of the target neural network model falls within the preset loss range, at which point the current round of training of the target neural network and the target classifier is complete.
Referring to fig. 4, the first comparison operation includes:
s401: comparing whether the first classification result is consistent with the second classification result; if yes, jumping to S402; if not, it jumps to S403.
S402: completing the current round of training of the target neural network and the target classifier; the process is ended.
S403: generating first feedback information, and adjusting parameters of the target neural network and the target classifier based on the first feedback information;
s404: based on the adjusted parameters, a new second classification result is determined for the training data using the target neural network and the target classifier, and S401 is performed again.
In a specific implementation, the accuracy of the compressed neural network model obtained after multiple rounds of training of the target neural network model must be ensured, that is, the classification results of the compressed neural network model and of the neural network model to be compressed on the same training data must be consistent. Therefore, the comparison module compares the first classification result with the second classification result. When they are inconsistent, first feedback information is generated, and the parameters of the target neural network and the target classifier are adjusted based on this first feedback information; a new second classification result is then determined for the training data using the target neural network and target classifier with the adjusted parameters, the first comparison operation is performed again with the first classification result and the new second classification result, and the process is repeated until the first classification result and the second classification result are consistent.
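A minimal sketch of the first comparison operation alone, under the same assumptions as above: the "first feedback information" is approximated by the gradient of a cross-entropy loss between the target model's output and the first classification result, which is only one possible realization that the application leaves open.

```python
import torch
import torch.nn.functional as F

def first_comparison_operation(x, first_labels, target_model, optimizer, max_iters=200):
    """Compare the second classification result with the first; while they disagree,
    adjust the target neural network and target classifier and classify again."""
    for _ in range(max_iters):
        _, second_logits = target_model(x)
        if torch.equal(second_logits.argmax(dim=1), first_labels):
            return True                                      # results consistent: round complete
        loss = F.cross_entropy(second_logits, first_labels)  # stand-in for the first feedback information
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return False
```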
In addition, referring to fig. 5, a fourth embodiment of the present application further provides a specific method for performing similarity matching on the first feature vector and the second feature vector and performing the current round of training on the target neural network according to the result of the similarity matching, including:
S501: clustering the first feature vector and the second feature vector respectively;
S502: generating a first adjacency matrix according to the result of clustering the first feature vector;
S503: generating a second adjacency matrix according to the result of clustering the second feature vector;
S504: performing the current round of training on the parameters of the target neural network according to the similarity between the first adjacency matrix and the second adjacency matrix.
In a specific implementation, the first feature vectors can be regarded as points mapped into a high-dimensional space. The points are clustered according to the distances between them, points whose mutual distance is within a preset threshold being assigned to the same class, and a first adjacency matrix describing the point-to-point relations is then formed according to the clustering result.
In the first adjacency matrix, if two points belong to the same class during clustering, the distance between the two points is 1; if two points do not belong to the same class at the time of clustering, the distance between the two points is 0.
For example, suppose there are 5 training data, and the obtained first feature vectors are numbered 1, 2, 3, 4 and 5. If the result of clustering the first feature vectors is {1,3}, {2} and {4,5}, the adjacency matrix formed is:
1 0 1 0 0
0 1 0 0 0
1 0 1 0 0
0 0 0 1 1
0 0 0 1 1
The second adjacency matrix is formed according to the result of clustering the second feature vectors in the same way, and is therefore not described again.
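The construction of such an adjacency matrix from a clustering result can be sketched as follows; the cluster labels reproduce the {1,3}, {2}, {4,5} example above, and the convention that a point is in the same class as itself (unit diagonal) is an assumption of this illustration.

```python
import numpy as np

def adjacency_from_clusters(cluster_ids):
    """Entry (i, j) is 1 if samples i and j were clustered into the same class, else 0."""
    ids = np.asarray(cluster_ids)
    return (ids[:, None] == ids[None, :]).astype(int)

# Five first feature vectors 1..5 clustered as {1, 3}, {2}, {4, 5}:
cluster_ids = [0, 1, 0, 2, 2]          # samples 1 and 3 -> cluster 0; 2 -> 1; 4 and 5 -> 2
print(adjacency_from_clusters(cluster_ids))
# [[1 0 1 0 0]
#  [0 1 0 0 0]
#  [1 0 1 0 0]
#  [0 0 0 1 1]
#  [0 0 0 1 1]]
```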
The fourth embodiment of the present application further provides a method for performing the current round of training on the parameters of the target neural network according to the similarity between the first adjacency matrix and the second adjacency matrix, where the method includes: performing the following similarity determination operation until the similarity value between the first adjacency matrix and the second adjacency matrix is smaller than a preset first similarity threshold, at which point the current round of training of the target neural network is complete;
referring to fig. 6, the similarity determination operation includes:
S601: comparing whether the similarity value between the first adjacency matrix and the second adjacency matrix is smaller than the preset first similarity threshold; if so, go to S602; if not, go to S603.
Here, in a specific implementation, the similarity between the currently obtained first adjacency matrix and second adjacency matrix is calculated from the traces of the two matrices: the closer the trace of the first adjacency matrix is to the trace of the second adjacency matrix, the higher the similarity between them. The difference between the trace of the first adjacency matrix and the trace of the second adjacency matrix may be taken as the similarity value, so that the greater the absolute value of this difference, the lower the similarity between the first adjacency matrix and the second adjacency matrix.
S602: and finishing the current round of training of the target neural network. The process is ended.
S603: generating first feedback information, and adjusting parameters of the target neural network based on the first feedback information;
s604: extracting a new second feature vector for the training data by using the target neural network based on the adjusted parameters; clustering the new second feature vector to generate a new second adjacency matrix, and performing S601 again.
In a specific implementation, the higher the similarity between the first adjacency matrix and the second adjacency matrix, the more similar the clustering structure of the first feature vectors characterized by the first adjacency matrix is to that of the second feature vectors characterized by the second adjacency matrix. The parameters of the target neural network are therefore adjusted according to the similarity between the two adjacency matrices, so that the second feature vector obtained when the target neural network extracts features from the training data comes closer to the first feature vector obtained when the neural network to be compressed extracts features from the same training data.
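The sketch below illustrates one reading of this step. Agglomerative clustering with a distance threshold stands in for the distance-based clustering of S501, and the similarity measure follows the literal wording above (the absolute difference between the traces of the two adjacency matrices). Note that if both matrices carry a unit diagonal their traces coincide, so a practical implementation might instead compare the matrices entrywise; the alternative adjacency_mismatch is such a stand-in and is not the formula named in the text.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def threshold_cluster(features, distance_threshold=1.0):
    """Cluster feature vectors so that points within the preset distance fall into one class (S501)."""
    model = AgglomerativeClustering(n_clusters=None, distance_threshold=distance_threshold)
    return model.fit_predict(features)

def trace_similarity(a1, a2):
    """Literal reading of the text: |trace(A1) - trace(A2)|; a larger value means lower similarity."""
    return abs(np.trace(a1) - np.trace(a2))

def adjacency_mismatch(a1, a2):
    """Alternative measure (not the wording above): number of entries on which the matrices disagree."""
    return int((a1 != a2).sum())
```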
In addition, referring to fig. 7, a fifth embodiment of the present application further provides another specific method for performing similarity matching on the first feature vector and the second feature vector and performing the current round of training on the target neural network according to the result of the similarity matching, including:
S701: performing a dimension reduction operation on the first feature vector and the second feature vector respectively, to obtain a first dimension-reduced feature vector from the first feature vector and a second dimension-reduced feature vector from the second feature vector.
In a specific implementation, the dimension reduction operation re-encodes the first feature vector and the second feature vector to obtain the first dimension-reduced feature vector and the second dimension-reduced feature vector, for example by using a fully connected layer to capture features from the first feature vector and the second feature vector again.
S702: and calculating the similarity of the first dimension-reduced feature vector and the second dimension-reduced feature vector.
Here, when calculating the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector, the difference between the two vectors may be calculated and taken as the similarity result; element-by-element subtraction may be performed directly between the first dimension-reduced feature vector and the second dimension-reduced feature vector, with the result of the subtraction taken as the similarity result; or the first dimension-reduced feature vector and the second dimension-reduced feature vector may be regarded as points projected into a corresponding space, and the difference between the point distributions calculated. For example, projecting the first dimension-reduced feature vector and the second dimension-reduced feature vector into the corresponding space gives the points S(X1, Y1, Z1) and M(X2, Y2, Z2), and the distance between the two points, L = (X1 − X2)² + (Y1 − Y2)² + (Z1 − Z2)², is taken as their similarity; the smaller the distance, the greater the similarity.
S703: and performing the training of the parameters of the target neural network in the current round according to the similarity of the first dimension-reducing feature vector and the second dimension-reducing feature vector.
Here, training the parameters of the target neural network according to the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector actually means ensuring that this similarity falls within the preset second similarity threshold. Specifically, the following similarity determination operation can be performed until the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is smaller than the preset second similarity threshold, at which point the current round of training of the target neural network is complete.
Referring to fig. 8, the similarity determination operation includes:
S801: comparing whether the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is smaller than the preset second similarity threshold; if yes, executing S802; if not, executing S803.
Here, the method for calculating the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector may be referred to the description of S702, and is not repeated herein.
S802: and finishing the current round of training of the target neural network. The process is ended.
S803: and generating second feedback information, and adjusting parameters of the target neural network based on the second feedback information.
S804: based on the adjusted parameters, a new second feature vector is extracted for the training data using the target neural network. And performing dimensionality reduction operation on the new second feature vector to generate a new second dimensionality reduction feature vector, and performing S801 again.
Specifically, to ensure that the first feature vector and the second feature vector are as close as possible, the similarity between them must be kept below a certain threshold, that is, the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector must be smaller than the preset second similarity threshold. When the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is not smaller than the preset second similarity threshold, second feedback information is generated accordingly, and the parameters of the target neural network are adjusted based on this second feedback information, so that when the target neural network extracts the second feature vector from the training data again, it changes in the direction of increasing the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector. A new second feature vector is then extracted from the training data using the target neural network with the adjusted parameters, the dimension reduction operation is performed again on the new second feature vector to generate a new second dimension-reduced feature vector, and the similarity determination operation is performed again, until the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is smaller than the preset second similarity threshold.
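A sketch of this variant under the same assumptions as the earlier snippets: a single fully connected layer re-encodes both feature vectors (S701), the squared distance above serves as the similarity value (S702), and the "second feedback information" of S803 is approximated by a gradient step on that distance, which the application does not spell out.

```python
import torch
import torch.nn as nn

class DimensionReducer(nn.Module):
    """Re-encode a feature vector with a fully connected layer (one possible S701)."""
    def __init__(self, feat_dim=256, reduced_dim=64):
        super().__init__()
        self.fc = nn.Linear(feat_dim, reduced_dim)

    def forward(self, features):
        return self.fc(features)

def squared_distance(u, v):
    """L = (X1 - X2)^2 + (Y1 - Y2)^2 + ...; a smaller distance means greater similarity."""
    return ((u - v) ** 2).sum(dim=-1)

def reduced_similarity_round(x, first_feat, target_model, reducer, optimizer,
                             second_sim_threshold=0.1, max_steps=100):
    """S801-S804: adjust the target network until the reduced feature vectors are close enough.
    The optimizer is assumed to hold only the target model's parameters."""
    first_reduced = reducer(first_feat).detach()
    for _ in range(max_steps):
        second_feat, _ = target_model(x)
        dist = squared_distance(first_reduced, reducer(second_feat)).mean()
        if dist.item() < second_sim_threshold:
            break                        # similarity criterion met (S802): round complete
        optimizer.zero_grad()            # "second feedback information" approximated by the gradient (S803)
        dist.backward()
        optimizer.step()
```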
With the compressed neural network model obtained in the first embodiment of the application, the accuracy of the compressed neural network model can be ensured to be consistent with that of the neural network model to be compressed. However, for a model to be compressed that was obtained by unsupervised learning or by a transfer learning training method, if the neural network model to be compressed misclassifies certain training data, the compressed neural network model will, to a certain extent, also misclassify those training data. The sixth embodiment of the application therefore provides another neural network model compression method, which can further improve the precision of the compressed neural network model.
Referring to fig. 9, before performing similarity matching on the first feature vector and the second feature vector, the neural network model compression method according to the sixth embodiment of the present application further includes:
s901: a noise addition operation is performed on the first feature vector.
In a specific implementation, noise is added to the first feature vector in order to increase the generalization capability of the compressed neural network model obtained by training; generalization capability refers to the ability of a machine learning algorithm to adapt to fresh samples. When the noise addition operation is performed on the first feature vector, noise of different degrees or of different types may be added to the first feature vector multiple times. Each addition of noise generates a noise-added first feature vector, and each noise-added first feature vector deviates to some degree from the original first feature vector, so that one training datum yields a plurality of deviated first feature vectors. This enriches the data volume of first feature vectors, so that fewer input training data are needed for the same amount of first-feature-vector data and the data can be fitted better. In addition, the neural network model to be compressed is not necessarily accurate in classifying some training data, so the variation introduced into the first feature vector may make the noise-added first feature vectors closer to reality, giving better guidance for the training of the target neural network model.
When noise is added to the first feature vector, a noise vector with the same dimension as the first feature vector is generally constructed, and the noise is added by summing the first feature vector and the noise vector position by position.
When constructing a noise vector with the same dimension as the first feature vector, the noise vector may be constructed directly or indirectly. Direct construction means directly generating a noise vector with the same dimension as the first feature vector; for example, when the dimension of the first feature vector is 1 × 1000, the constructed noise vector is also 1 × 1000. Indirect construction means generating an intermediate noise vector with a dimension lower than that of the first feature vector and then producing a noise vector with the same dimension as the first feature vector by filling the intermediate noise vector with zeros; for example, when the dimension of the first feature vector is 1 × 1000, the constructed intermediate noise vector is 1 × 500, and zeros are filled at arbitrary positions of the intermediate noise vector to finally form a noise vector with a dimension of 1 × 1000.
In addition, because noise can be added to the first feature vector multiple times and to different degrees, noise of different degrees can be obtained by changing the parameters of the noise generation algorithm, or, when the noise vector is constructed indirectly, by filling zeros at different positions; different kinds of noise can be obtained by changing the noise generation algorithm itself.
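The two construction routes can be sketched as follows; the Gaussian noise distribution, the scale, and the 1 × 1000 / 1 × 500 dimensions are taken from the example above or assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def direct_noise(feat_dim=1000, scale=0.1):
    """Directly construct a noise vector with the same dimension as the first feature vector."""
    return scale * rng.standard_normal(feat_dim)

def indirect_noise(feat_dim=1000, intermediate_dim=500, scale=0.1):
    """Construct an intermediate noise vector of lower dimension, then fill zeros at arbitrary
    positions so the result has the same dimension as the first feature vector."""
    noise = np.zeros(feat_dim)
    positions = rng.choice(feat_dim, size=intermediate_dim, replace=False)
    noise[positions] = scale * rng.standard_normal(intermediate_dim)
    return noise

# Noise is added by element-wise (position-by-position) addition with the first feature vector.
first_feature_vector = np.ones(1000)
noisy_first_feature_vector = first_feature_vector + indirect_noise()
```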
S902: and performing similarity matching on the first feature vector and the second feature vector added with the noise.
The method for performing similarity matching between the noise-added first feature vector and the second feature vector is similar to the method for performing similarity matching between the first feature vector without added noise and the second feature vector; reference may be made to the description above, and details are not repeated here.
In addition, in this embodiment, since noise is added to the first feature vector, when the classifier to be compressed is used to classify the first feature vector to which the noise is added, the classification result may be different from the original classification result of the first feature vector, and if the classification result is not corrected, the accuracy of the finally obtained compressed neural network model may be affected.
Therefore, in the embodiment of the present application, while similarity matching is performed between the noise-added first feature vector and the second feature vector and the training of the target neural network is completed based on the similarity matching result, a second comparison operation is performed until the first classification result is consistent with the label of the training data, at which point the current round of training of the neural network to be compressed and the classifier to be compressed is complete;
the second comparison operation includes:
comparing the first classification result with the label of the training data;
when the comparison results are inconsistent, generating corresponding feedback information, and adjusting the parameters of the neural network to be compressed and of the classifier to be compressed based on this feedback information;
and based on the adjusted parameters, extracting a new first classification result for the training data by using the neural network to be compressed and the classifier to be compressed, and executing a second comparison operation again.
The fine adjustment of the neural network model to be compressed can be realized through the steps, so that the neural network model to be compressed and the compressed neural network model obtained through training can have better generalization capability and higher precision.
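A sketch of the second comparison operation under the same assumptions: the feedback that fine-tunes the neural network to be compressed and its classifier is approximated by a cross-entropy gradient step against the labels of the training data, and the optimizer is assumed to hold the parameters of the model to be compressed.

```python
import torch
import torch.nn.functional as F

def second_comparison_operation(x, labels, model_to_compress, optimizer, max_iters=200):
    """Compare the first classification result with the training-data labels; while they
    disagree, adjust the neural network to be compressed and the classifier to be compressed."""
    for _ in range(max_iters):
        _, first_logits = model_to_compress(x)
        if torch.equal(first_logits.argmax(dim=1), labels):
            return True                                  # consistent with the labels: fine-tuning round done
        loss = F.cross_entropy(first_logits, labels)     # stand-in for the feedback information
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return False
```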
Based on the same inventive concept, the embodiment of the present invention further provides a neural network model compression apparatus corresponding to the neural network model compression method, and as the principle of the apparatus in the embodiment of the present invention for solving the problem is similar to the neural network model compression method described above in the embodiment of the present invention, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 10, a neural network model compression apparatus provided by the seventh embodiment of the present invention specifically includes:
the input module 11 is used for inputting the training data into the neural network model to be compressed and the target neural network model;
the first training module 12 is configured to train the target neural network model based on feature vectors and classification results extracted from training data by the neural network model to be compressed, so as to obtain a compressed neural network model;
and the number of the target neural network model parameters is less than that of the neural network model parameters to be compressed.
When compressing the neural network model to be compressed, the neural network model compression device provided by the embodiment of the application pre-constructs a target neural network with fewer parameters than the neural network model to be compressed, then inputs the training data into the neural network model to be compressed and into the target neural network model, and guides the training of the target neural network model based on the feature vectors and classification results extracted from the training data by the neural network model to be compressed, so as to obtain the compressed neural network model. This process does not operate on the neural network model to be compressed itself, and the compressed neural network model finally obtained produces the same classification results on the same training data as the neural network model to be compressed, so no precision is lost during model compression; on the premise of ensuring precision, the size of the model is therefore compressed, meeting the dual requirements on precision and model size.
Optionally, the apparatus further comprises: a second training module 13, configured to input the training data into the neural network model to be compressed before the training data are input into the neural network model to be compressed and the target neural network model, and to train the neural network model to be compressed to obtain the trained neural network model to be compressed.
Optionally, the neural network model to be compressed includes: a neural network to be compressed and a classifier to be compressed; the target neural network model includes: a target neural network and a target classifier;
the first training module 12 is specifically configured to: extract a first feature vector from the input training data using the neural network to be compressed, and extract a second feature vector from the input training data using the target neural network;
performing similarity matching on the first feature vector and the second feature vector, and performing the training of the target neural network according to the result of the similarity matching; and
inputting the first feature vector to a classifier to be compressed to obtain a first classification result;
inputting the second feature vector into a target classifier to obtain a second classification result;
performing the training of the target neural network and the target classifier in the current round according to the comparison result of the first classification result and the second classification result;
and performing multi-round training on the target neural network and the target classifier to obtain a compressed neural network model.
Optionally, the first training module 12 is specifically configured to perform the following first comparison operation until the classification loss of the target neural network model meets a preset loss range, so as to complete the current round of training on the target neural network and the target classifier;
the first comparison operation includes:
comparing the first classification result with the second classification result;
generating first feedback information aiming at the condition that the comparison result is inconsistent, and adjusting parameters of the target neural network and the target classifier based on the first feedback information;
based on the adjusted parameters, a new second classification result is determined for the training data using the target neural network and the target classifier, and the first comparison operation is performed again.
Optionally, the first training module 12 is further configured to: perform a noise addition operation on the first feature vector before similarity matching is carried out on the first feature vector and the second feature vector; and perform similarity matching between the noise-added first feature vector and the second feature vector.
Optionally, the first training module 12 is specifically configured to perform similarity matching on the first feature vector and the second feature vector through the following steps, and to perform the current round of training on the target neural network according to the result of the similarity matching: clustering the first feature vector and the second feature vector respectively;
generating a first adjacency matrix according to the result of clustering the first feature vector;
generating a second adjacency matrix according to the result of clustering the second feature vector;
and performing the current round of training on the parameters of the target neural network according to the similarity between the first adjacency matrix and the second adjacency matrix.
Optionally, the first training module 12 is specifically configured to perform the following similarity determination operation until the similarity between the first adjacency matrix and the second adjacency matrix is smaller than a preset first similarity threshold, so as to complete the current round of training on the target neural network;
the similarity determination operation includes:
calculating the similarity between the first adjacency matrix and the second adjacency matrix which are obtained currently;
generating first feedback information aiming at the condition that the similarity is not less than a preset first similarity threshold, and adjusting parameters of the target neural network based on the first feedback information;
extracting a new second feature vector for the training data by using the target neural network based on the adjusted parameters;
and clustering the new second eigenvector to generate a new second adjacency matrix, and performing similarity determination operation again.
Optionally, the first training module 12 is specifically configured to perform similarity matching on the first feature vector and the second feature vector through the following steps, and perform a current training on the target neural network according to a result of the similarity matching:
respectively carrying out a dimension reduction operation on the first feature vector and the second feature vector to obtain a first dimension-reduced feature vector from the first feature vector and a second dimension-reduced feature vector from the second feature vector;
calculating the similarity of the first dimension-reducing feature vector and the second dimension-reducing feature vector;
and performing the training of the parameters of the target neural network in the current round according to the similarity of the first dimension-reducing feature vector and the second dimension-reducing feature vector.
Optionally, the first training module 12 is specifically configured to perform the following similarity determination operation until the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is smaller than a preset second similarity threshold, and complete the current round of training on the target neural network;
the similarity determination operation includes:
calculating the similarity between the first dimension-reducing feature vector and the second dimension-reducing feature vector which are obtained currently;
generating second feedback information aiming at the condition that the similarity is not less than a preset second similarity threshold, and adjusting parameters of the target neural network based on the second feedback information;
extracting a new second feature vector for the training data by using the target neural network based on the adjusted parameters;
and carrying out dimensionality reduction operation on the new second feature vector to generate a new second dimensionality reduction feature vector, and carrying out similarity determination operation again.
Corresponding to the neural network model compression method in fig. 1, an eighth embodiment of the present invention further provides a computer device, as shown in fig. 11, the computer device includes a memory 1000, a processor 2000 and a computer program stored in the memory 1000 and executable on the processor 2000, where the processor 2000 implements the steps of the neural network model compression method when executing the computer program.
Specifically, the memory 1000 and the processor 2000 can be general memories and general processors, which are not specifically limited herein, and when the processor 2000 runs a computer program stored in the memory 1000, the neural network model compression method can be executed, so as to solve the problem that the existing model compression method cannot meet the requirement for precision use because the model compression is performed on the premise of sacrificing the precision of the model, and further achieve the effect of compressing the size of the model under the condition of ensuring the precision of the neural network model.
Corresponding to the neural network model compression method in fig. 1, a ninth embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the neural network model compression method.
Specifically, the storage medium can be a general storage medium, such as a mobile disk, a hard disk, and the like, and when a computer program on the storage medium is run, the neural network model compression method can be executed, so that the problem that the existing model compression method cannot meet the requirement for precision use because model compression is performed on the premise of sacrificing the precision of the model is solved, and the effect of compressing the size of the model under the condition of ensuring the precision of the neural network model is achieved.
The neural network model compression method and the computer program product of the apparatus provided in the embodiments of the present invention include a computer readable storage medium storing a program code, and instructions included in the program code may be used to execute the method in the foregoing method embodiments, and specific implementation may refer to the method embodiments, and will not be described herein again.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A neural network model compression method, the method comprising:
inputting image data to be trained, voice data to be trained and text data to be trained into a neural network model to be compressed and a target neural network model;
training a target neural network model based on the feature vectors and classification results extracted by the neural network model to be compressed from the image data to be trained, the voice data to be trained and the text data to be trained, to obtain a compressed neural network model; the compressed neural network model is used for executing an image processing task on image data, a voice processing task on voice data, or a text processing task on text data; the number of parameters of the target neural network model is less than the number of parameters of the neural network model to be compressed;
the neural network model to be compressed comprises: a neural network to be compressed and a classifier to be compressed; the target neural network model comprises: a target neural network and a target classifier;
the training a target neural network model based on the feature vectors and classification results extracted by the neural network model to be compressed from the image data to be trained, the voice data to be trained and the text data to be trained, to obtain a compressed neural network model specifically comprises:
extracting a first feature vector from the input image data to be trained, voice data to be trained and text data to be trained by using the neural network to be compressed, and extracting a second feature vector from the input image data to be trained, voice data to be trained and text data to be trained by using the target neural network;
performing similarity matching on the first feature vector and the second feature vector, and performing the current round of training on the target neural network according to the result of the similarity matching; and
inputting the first feature vector to the classifier to be compressed to obtain a first classification result;
inputting the second feature vector to the target classifier to obtain a second classification result;
performing a current round of training on the target neural network and the target classifier according to a comparison result of the first classification result and the second classification result;
performing multi-round training on the target neural network and the target classifier to obtain a compressed neural network model;
performing the following first comparison operation until the classification loss of the target neural network model falls within a preset loss range, thereby completing the current round of training of the target neural network and the target classifier;
the first comparison operation includes:
comparing the first classification result with the second classification result;
generating first feedback information when the comparison results are inconsistent, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
and determining a new second classification result for the image data to be trained, the voice data to be trained and the text data to be trained by using the target neural network and the target classifier with the adjusted parameters, and executing the first comparison operation again.
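For illustration only (this sketch is not part of the claims), the training loop described in claim 1 can be pictured as follows. It assumes PyTorch models teacher_net and teacher_clf for the neural network and classifier to be compressed, student_net and student_clf for the target neural network and classifier, that both networks emit feature vectors of the same dimensionality, and that the similarity matching uses a mean-squared-error loss while the classification-result comparison uses a cross-entropy loss against the teacher's predicted labels; all of these names and loss choices are assumptions rather than limitations of the claims.

    import torch
    import torch.nn.functional as F

    def train_one_round(teacher_net, teacher_clf, student_net, student_clf,
                        batch, optimizer, loss_range=0.05, max_steps=1000):
        # The neural network model to be compressed (teacher) is kept frozen;
        # only the target (student) network and classifier are adjusted.
        teacher_net.eval()
        teacher_clf.eval()
        with torch.no_grad():
            f1 = teacher_net(batch)                  # first feature vector
            y1 = teacher_clf(f1).argmax(dim=1)       # first classification result (as labels)
        for _ in range(max_steps):
            f2 = student_net(batch)                  # second feature vector
            logits2 = student_clf(f2)                # second classification result
            sim_loss = F.mse_loss(f2, f1)            # similarity matching of the two feature vectors
            cls_loss = F.cross_entropy(logits2, y1)  # comparison of the two classification results
            optimizer.zero_grad()
            (sim_loss + cls_loss).backward()         # "first feedback information" as gradients
            optimizer.step()                         # parameter adjustment of the target model
            if cls_loss.item() <= loss_range:        # classification loss within the preset range
                break

In such a sketch the optimizer would be built over the parameters of student_net and student_clf only, and the round would be repeated over many batches, corresponding to the multi-round training that yields the compressed neural network model.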
2. The method of claim 1, further comprising, prior to inputting the image data to be trained, the voice data to be trained and the text data to be trained into the neural network model to be compressed and the target neural network model:
inputting the image data to be trained, the voice data to be trained and the text data to be trained into the neural network model to be compressed, and training the neural network model to be compressed to obtain the trained neural network model to be compressed.
3. The method of claim 1, further comprising, before the similarity matching of the first feature vector and the second feature vector:
performing a noise addition operation on the first feature vector;
the performing similarity matching on the first feature vector and the second feature vector specifically comprises:
and performing similarity matching on the noise-added first feature vector and the second feature vector.
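As a sketch of the noise addition in claim 3 (the claim does not fix the type of noise; an additive Gaussian perturbation and the scale value below are assumed purely for illustration):

    import torch

    def add_noise(first_feature, sigma=0.01):
        # Perturb the first feature vector before similarity matching;
        # sigma is an illustrative noise scale, not a value taken from the patent.
        return first_feature + sigma * torch.randn_like(first_feature)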
4. The method according to any one of claims 1 to 3, wherein the performing similarity matching on the first feature vector and the second feature vector and performing the current round of training on the target neural network according to the result of the similarity matching specifically comprises:
clustering the first feature vector and the second feature vector respectively;
generating a first adjacency matrix according to the result of clustering the first feature vector;
generating a second adjacency matrix according to the result of clustering the second feature vector;
and performing the current round of training on the parameters of the target neural network according to the similarity between the first adjacency matrix and the second adjacency matrix.
5. The method according to claim 4, wherein the following similarity determination operation is performed until the similarity between the first adjacency matrix and the second adjacency matrix is smaller than a preset first similarity threshold, at which point the current round of training of the target neural network is completed;
the similarity determination operation includes:
calculating the similarity between the currently obtained first adjacency matrix and second adjacency matrix;
generating first feedback information when the similarity is not less than the preset first similarity threshold, and adjusting the parameters of the target neural network based on the first feedback information;
extracting new second feature vectors from the image data to be trained, the voice data to be trained and the text data to be trained by using the target neural network with the adjusted parameters;
and clustering the new second feature vectors to generate a new second adjacency matrix, and executing the similarity determination operation again.
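A minimal sketch of the clustering-based matching in claims 4 and 5 might look as follows. It assumes k-means clustering, an adjacency matrix whose (i, j) entry is 1 when samples i and j fall into the same cluster, and a score measured as the fraction of disagreeing entries, treated as a discrepancy that training drives below the first similarity threshold; the claims leave the clustering method and the similarity measure open, so all of these choices are assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def adjacency_from_features(features, n_clusters=10):
        # Cluster the feature vectors (one row per sample) and build a
        # same-cluster adjacency matrix.
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
        return (labels[:, None] == labels[None, :]).astype(np.float32)

    def adjacency_discrepancy(a1, a2):
        # Fraction of entries on which the two adjacency matrices disagree;
        # the current round of training ends once this falls below the preset threshold.
        return float(np.mean(np.abs(a1 - a2)))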
6. The method according to any one of claims 1 to 3, wherein the performing similarity matching on the first feature vector and the second feature vector and performing the current round of training on the target neural network according to the result of the similarity matching specifically comprises:
respectively performing a dimensionality reduction operation on the first feature vector and the second feature vector to obtain a first dimension-reduced feature vector of the first feature vector and a second dimension-reduced feature vector of the second feature vector;
calculating the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector;
and performing the current round of training on the parameters of the target neural network according to the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector.
7. The method of claim 6, wherein the following similarity determination operation is performed until the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is smaller than a preset second similarity threshold, at which point the current round of training of the target neural network is completed;
the similarity determination operation includes:
calculating the similarity between the currently obtained first dimension-reduced feature vector and second dimension-reduced feature vector;
generating second feedback information when the similarity is not less than the preset second similarity threshold, and adjusting the parameters of the target neural network based on the second feedback information;
extracting new second feature vectors from the image data to be trained, the voice data to be trained and the text data to be trained by using the target neural network with the adjusted parameters;
and performing a dimensionality reduction operation on the new second feature vectors to generate a new second dimension-reduced feature vector, and executing the similarity determination operation again.
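Similarly, the dimensionality-reduction branch of claims 6 and 7 could be sketched as below, assuming PCA as the reduction, the mean Euclidean distance between paired reduced vectors as the score that is driven below the second similarity threshold, and first and second feature vectors of the same dimensionality; none of these choices is prescribed by the claims.

    import numpy as np
    from sklearn.decomposition import PCA

    def reduced_discrepancy(first_features, second_features, n_components=32):
        # Project both feature sets into a low-dimensional space.
        pca = PCA(n_components=n_components)
        r1 = pca.fit_transform(first_features)   # first dimension-reduced feature vectors
        r2 = pca.transform(second_features)      # second dimension-reduced feature vectors
        # Mean Euclidean distance between paired reduced vectors (illustrative measure);
        # training of the target neural network continues while this stays above the threshold.
        return float(np.mean(np.linalg.norm(r1 - r2, axis=1)))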
8. A neural network model compression apparatus, applied to an embedded device, the apparatus comprising:
an input module, configured to input image data to be trained, voice data to be trained and text data to be trained into a neural network model to be compressed and a target neural network model;
a first training module, configured to train a target neural network model based on the feature vectors and classification results extracted by the neural network model to be compressed from the image data to be trained, the voice data to be trained and the text data to be trained, to obtain a compressed neural network model; the compressed neural network model is used for executing an image processing task on image data, a voice processing task on voice data, or a text processing task on text data; the number of parameters of the target neural network model is less than the number of parameters of the neural network model to be compressed;
the neural network model to be compressed comprises: a neural network to be compressed and a classifier to be compressed; the target neural network model comprises: a target neural network and a target classifier;
the first training module is specifically configured to: extract a first feature vector from the input image data to be trained, voice data to be trained and text data to be trained by using the neural network to be compressed, and extract a second feature vector from the input image data to be trained, voice data to be trained and text data to be trained by using the target neural network;
perform similarity matching on the first feature vector and the second feature vector, and perform the current round of training on the target neural network according to the result of the similarity matching; and
input the first feature vector to the classifier to be compressed to obtain a first classification result;
input the second feature vector to the target classifier to obtain a second classification result;
perform the current round of training on the target neural network and the target classifier according to the comparison result of the first classification result and the second classification result;
perform multi-round training on the target neural network and the target classifier to obtain the compressed neural network model;
the first training module is specifically configured to perform the following first comparison operation until the classification loss of the target neural network model falls within a preset loss range, thereby completing the current round of training of the target neural network and the target classifier;
the first comparison operation includes:
comparing the first classification result with the second classification result;
generating first feedback information when the comparison results are inconsistent, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
and determining a new second classification result for the image data to be trained, the voice data to be trained and the text data to be trained by using the target neural network and the target classifier with the adjusted parameters, and executing the first comparison operation again.
CN201810274146.3A 2018-03-29 2018-03-29 Neural network model compression method and device Active CN108510083B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810274146.3A CN108510083B (en) 2018-03-29 2018-03-29 Neural network model compression method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810274146.3A CN108510083B (en) 2018-03-29 2018-03-29 Neural network model compression method and device

Publications (2)

Publication Number Publication Date
CN108510083A CN108510083A (en) 2018-09-07
CN108510083B true CN108510083B (en) 2021-05-14

Family

ID=63379557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810274146.3A Active CN108510083B (en) 2018-03-29 2018-03-29 Neural network model compression method and device

Country Status (1)

Country Link
CN (1) CN108510083B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200019840A1 (en) * 2018-07-13 2020-01-16 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for sequential event prediction with noise-contrastive estimation for marked temporal point process
CN110929839B (en) * 2018-09-20 2024-04-16 深圳市商汤科技有限公司 Method and device for training neural network, electronic equipment and computer storage medium
CN110163236B (en) * 2018-10-15 2023-08-29 腾讯科技(深圳)有限公司 Model training method and device, storage medium and electronic device
CN111242273B (en) * 2018-11-29 2024-04-12 华为终端有限公司 Neural network model training method and electronic equipment
WO2020108368A1 (en) * 2018-11-29 2020-06-04 华为技术有限公司 Neural network model training method and electronic device
CN110008880B (en) * 2019-03-27 2023-09-29 深圳前海微众银行股份有限公司 Model compression method and device
CN112020724A (en) * 2019-04-01 2020-12-01 谷歌有限责任公司 Learning compressible features
WO2020231049A1 (en) 2019-05-16 2020-11-19 Samsung Electronics Co., Ltd. Neural network model apparatus and compressing method of neural network model
CN110211121B (en) * 2019-06-10 2021-07-16 北京百度网讯科技有限公司 Method and device for pushing model
CN111967594A (en) * 2020-08-06 2020-11-20 苏州浪潮智能科技有限公司 Neural network compression method, device, equipment and storage medium
CN112287968A (en) * 2020-09-23 2021-01-29 深圳云天励飞技术股份有限公司 Image model training method, image processing method, chip, device and medium
CN112288032B (en) * 2020-11-18 2022-01-14 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN113505774B (en) * 2021-07-14 2023-11-10 众淼创新科技(青岛)股份有限公司 Policy identification model size compression method
CN115526266B (en) * 2022-10-18 2023-08-29 支付宝(杭州)信息技术有限公司 Model Training Method and Device, Service Prediction Method and Device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170337711A1 (en) * 2011-03-29 2017-11-23 Lyrical Labs Video Compression Technology, LLC Video processing and encoding
GB2495265A (en) * 2011-07-07 2013-04-10 Toyota Motor Europe Nv Sa Artificial memory system for predicting behaviours in order to assist in the control of a system, e.g. stability control in a vehicle
US10204286B2 (en) * 2016-02-29 2019-02-12 Emersys, Inc. Self-organizing discrete recurrent network digital image codec

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104661037A (en) * 2013-11-19 2015-05-27 中国科学院深圳先进技术研究院 Tampering detection method and system for compressed image quantization table
WO2015089148A2 (en) * 2013-12-13 2015-06-18 Amazon Technologies, Inc. Reducing dynamic range of low-rank decomposition matrices
CN104331738A (en) * 2014-10-21 2015-02-04 西安电子科技大学 Network reconfiguration algorithm based on game theory and genetic algorithm
EP3168781A1 (en) * 2015-11-16 2017-05-17 Samsung Electronics Co., Ltd. Method and apparatus for recognizing object, and method and apparatus for training recognition model
CN106096670A (en) * 2016-06-17 2016-11-09 北京市商汤科技开发有限公司 Concatenated convolutional neural network training and image detection method, apparatus and system
CN106251347A (en) * 2016-07-27 2016-12-21 广东工业大学 subway foreign matter detecting method, device, equipment and subway shield door system
CN106503799A (en) * 2016-10-11 2017-03-15 天津大学 Deep learning model based on multi-scale network and its application in brain state monitoring
CN106778684A (en) * 2017-01-12 2017-05-31 易视腾科技股份有限公司 deep neural network training method and face identification method
CN106845381A (en) * 2017-01-16 2017-06-13 西北工业大学 Spatial-spectral joint hyperspectral image classification method based on dual-channel convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression; Jian-Hao Luo et al.; Computer Vision Foundation; 20171231; pp. 5058-5065 *
Research on Deep Neural Network Compression and Optimization; Wang Zhengtao; China Masters' Theses Full-text Database (Information Science and Technology); 20180215 (No. 02); pp. I140-289 *

Also Published As

Publication number Publication date
CN108510083A (en) 2018-09-07

Similar Documents

Publication Publication Date Title
CN108510083B (en) Neural network model compression method and device
US20200097818A1 (en) Method and system for training binary quantized weight and activation function for deep neural networks
CN112084331A (en) Text processing method, text processing device, model training method, model training device, computer equipment and storage medium
Sarkhel et al. A multi-objective approach towards cost effective isolated handwritten Bangla character and digit recognition
US11663483B2 (en) Latent space and text-based generative adversarial networks (LATEXT-GANs) for text generation
CN111126488B (en) Dual-attention-based image recognition method
Wang et al. Towards evolutionary compression
EP3982275A1 (en) Image processing method and apparatus, and computer device
US20230085401A1 (en) Method of training an image classification model
Feng et al. Evolutionary fuzzy particle swarm optimization vector quantization learning scheme in image compression
EP3295381B1 (en) Augmenting neural networks with sparsely-accessed external memory
CN111241287A (en) Training method and device for generating generation model of confrontation text
CN107480143A (en) Dialogue topic dividing method and system based on context dependence
CN112464004A (en) Multi-view depth generation image clustering method
CN111352965A (en) Training method of sequence mining model, and processing method and equipment of sequence data
CN112446888A (en) Processing method and processing device for image segmentation model
CN113632106A (en) Hybrid precision training of artificial neural networks
CN114332578A (en) Image anomaly detection model training method, image anomaly detection method and device
US11373043B2 (en) Technique for generating and utilizing virtual fingerprint representing text data
CN114282059A (en) Video retrieval method, device, equipment and storage medium
Liu et al. Efficient neural networks for edge devices
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
Sun et al. Deep Evolutionary 3D Diffusion Heat Maps for Large-pose Face Alignment.
Tang et al. Bringing giant neural networks down to earth with unlabeled data
US11914670B2 (en) Methods and systems for product quantization-based compression of a matrix

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100070, No. 101-8, building 1, 31, zone 188, South Fourth Ring Road, Beijing, Fengtai District

Applicant after: Guoxin Youyi Data Co., Ltd

Address before: 100070, No. 188, building 31, headquarters square, South Fourth Ring Road West, Fengtai District, Beijing

Applicant before: SIC YOUE DATA Co.,Ltd.

GR01 Patent grant