CN113112020B - Model network extraction and compression method based on generation network and knowledge distillation - Google Patents
- Publication number: CN113112020B
- Application number: CN202110320646.8A
- Authority: CN (China)
- Prior art keywords: network, teacher, trained, generated, student
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N5/022 — Knowledge engineering; Knowledge acquisition (G: Physics; G06: Computing, calculating or counting; G06N: Computing arrangements based on specific computational models; G06N5/00: Computing arrangements using knowledge-based models; G06N5/02: Knowledge representation, symbolic representation)
- G06N5/027 — Frames (same hierarchy)
Abstract
The invention discloses a model network extraction and compression method based on a generation network and knowledge distillation, comprising the following steps: training the loss function of the generation network with the trained teacher network to obtain a trained generation network; producing a number of generated pictures with the generation network; inputting the generated pictures into the trained teacher network and the student network, and carrying out knowledge distillation on the student network; and updating the student network. When facing a large network, the method can learn only the classification knowledge of specific categories in the large network according to the task at hand and migrate that knowledge to a smaller network. Meanwhile, the method depends less on data: knowledge distillation is carried out without real data, reducing the dependence of conventional knowledge distillation on real data.
Description
Technical Field
The invention relates to the field of neural network model compression, in particular to a model network extraction and compression method based on a generation network and knowledge distillation.
Background
In the field of artificial intelligence, ever more complex network structures are proposed to solve different problems, and network scale keeps growing. In practical projects, however, limits such as hardware computing capacity make large-scale networks with excellent performance difficult to deploy, which is why methods such as knowledge distillation are used to compress and accelerate trained large-scale networks. Moreover, for a trained network, the desired task targets are often not all of the original network's task targets but only some of them. For example, a large network may implement the 1000-class classification task on ImageNet, while the actual application needs not all 1000 classes but only 10 of them.
For network compression and speed-up, several classical methods have been studied and improved. Researchers have proposed a hashing scheme for neural networks that accelerates inference by hash-mapping the parameters, with parameters falling into the same hash bucket sharing one weight value. Model pruning evaluates the filters and fully connected neurons of a trained network to decide which filters and fully connected units to remove. In addition, kernel sparsification applies regularization-induced updates to the weights so that the kernels become sparser and easier to prune. Beyond reducing model capacity by pruning a trained model, Hinton proposed the concept of knowledge distillation: by making the output labels of the student model as close as possible to those of the teacher model, the student fits the teacher, and the knowledge learned by the teacher network is transferred to a smaller student network. Compared with pruning-style methods, model compression by distillation is free from the constraint of the original model structure.
At present, several classical network compression methods have been studied: some prune the original model, some accelerate the network through kernel sparsity, and some compress by redesigning a smaller network through distillation.
Although techniques such as distillation free the compressed model from the structure of the original trained model, the distillation process depends strongly on the original dataset. Moreover, the task targets of the network before and after distillation are unchanged, and partial knowledge of the network cannot be migrated on its own.
Disclosure of Invention
The invention mainly aims to overcome the above defects in the prior art by providing a model network extraction and compression method based on a generation network and knowledge distillation, so that, when facing a large network, only the classification knowledge of specific categories in the large network is learned according to the task at hand and transferred to a smaller network; meanwhile, the method depends less on data, carrying out knowledge distillation without real data and reducing the dependence of conventional knowledge distillation on real data.
The invention adopts the following technical scheme:
A model network extraction and compression method based on generation network and knowledge distillation comprises the following steps:
training a loss function of the generated network by using the trained teacher network to obtain a trained generated network;
generating a plurality of generated pictures according to the generation network;
inputting the generated pictures into a trained teacher network and a trained student network, and carrying out knowledge distillation on the student network;
and updating the student network.
Specifically, training the loss function of the generation network with the trained teacher network to obtain the trained generation network comprises:
using the classification results output by the trained teacher network on the pictures produced by the generation network as feedback;
calculating the loss function of the generation network from this feedback;
calculating the gradient of the loss function and updating the parameters of the generator network; when the teacher network's outputs on the generated pictures and its classification results on real pictures meet the set requirements, the trained generation network is obtained.
Specifically, a trained teacher network is used to train the loss function of the generation network to obtain the trained generation network, where the loss function specifically is:

$L_G = \alpha L_{CE} + \beta L_{IE} + \gamma L_{B} + \delta L_{f}$

wherein $L_{CE}$ is the cross-entropy loss fed back by the teacher network on the generated pictures to the generator; $L_{IE}$ is the information entropy of the output on the target task, i.e. the uncertainty with which the teacher network judges a generated image as a target category; $L_B$ is the class-balance entropy over the target categories; $L_f$ is the distance of the network output feature maps; $\alpha$, $\beta$, $\gamma$ and $\delta$ are the weights of the four loss terms, each ranging from 0 to 1.

$L_{CE} = \frac{1}{n}\sum_{i=1}^{n} H_{cross}\left(y_i^T, t_i\right), \qquad t_i = \arg\max_j \left(y_i^T\right)_j$

wherein $y_i^T$ is the output of the teacher network on the $i$-th generated picture, $t_i$ is the pseudo label obtained from that output, and $n$ is the number of pictures the generator produces in one batch.

$L_{IE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{M}\left(y_i^T\right)_j \log\left(y_i^T\right)_j, \qquad L_B = \sum_{i=1}^{M} p_i \log p_i$

wherein $N$ is the total number of task categories of the trained model; $M$ is the number of task categories of the target part, $M < N$; and $p_i$ is the frequency with which the teacher network classifies the $n$ generated pictures into the $i$-th category.

$L_f = \sum_{l}\left( \left\lVert \mu_l(\hat{x}) - \mu_l(x) \right\rVert_2 + \left\lVert \sigma_l^2(\hat{x}) - \sigma_l^2(x) \right\rVert_2 \right)$

wherein the real image is defined as $x$ and the image produced by the generator as $\hat{x}$; $\mu_l(\hat{x})$ and $\sigma_l^2(\hat{x})$ are the mean and variance of the generated pictures at the $l$-th layer of the network.
Specifically, the generated pictures are input into the teacher network and the student network, and knowledge distillation is carried out on the student network, which specifically comprises:

A set of $n$ random vectors $z_1, z_2, \ldots, z_n$ is input into the trained generation network $G$, and the output of the generation network is:

$\hat{x}_i = G(z_i), \quad i = 1, \ldots, n$

The generated pictures are input into the teacher network and the student network respectively, giving the teacher output $y_i^T$ and the student output $y_i^S$. With knowledge distillation, the optimization objective function of the student network is:

$W_S^{*} = \arg\min_{W_S} \frac{1}{n}\sum_{i=1}^{n} H_{cross}\left(y_i^T, y_i^S\right)$

wherein $W_S$ is the parameters of the student network.
As can be seen from the above description of the present invention, compared with the prior art, the present invention has the following advantages:
(1) The model network extraction and compression method based on a generation network and knowledge distillation trains the loss function of the generation network with the trained teacher network to obtain a trained generation network; produces a number of generated pictures with the generation network; inputs the generated pictures into the trained teacher network and the student network and carries out knowledge distillation on the student network; and updates the student network. The invention combines the picture-generation technology of generative networks with the knowledge distillation technology of network compression: from all the category knowledge learned by a large network, it purposefully distills only the part of the target knowledge of interest into a smaller network. Meanwhile, the technology uses the generation network together with a loss function designed to match the classification distributions in the middle layers of the original teacher network, reducing the dependence on real data when training the small network.
Drawings
FIG. 1 is a flowchart of the model network extraction and compression method based on a generation network and knowledge distillation provided by an embodiment of the invention.
The invention is described in further detail below with reference to the figures and specific examples.
Detailed Description
The invention is further described below by means of specific embodiments.
Knowledge distillation, proposed by Hinton et al., is a process that enables knowledge to be learned between networks, which may have different structures or similar structures with different capacities. Conventional knowledge distillation requires a trained, well-performing network as the teacher network, which is typically complex, and a smaller network designed from the task requirements as the student network. Knowledge distillation holds that the output of the last layer of the teacher network contains the rich knowledge learned by the model, reflected in its output distribution; the student network's output is therefore trained to learn the distribution output by the teacher network's last layer, thereby transferring the teacher network's knowledge to the smaller student network.
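The Hinton-style distillation described above can be sketched in a few lines of NumPy. This is a minimal, hypothetical illustration — the function names and the temperature value are mine, not from the patent, and the temperature-softened KL form is the classic formulation rather than the patent's specific objective:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions.

    Minimizing this makes the student's output distribution match the
    distribution produced by the teacher's last layer.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())

teacher_logits = np.array([[8.0, 2.0, 0.5]])
close_student = np.array([[7.5, 2.3, 0.6]])
far_student = np.array([[0.5, 2.0, 8.0]])
```

A student whose logits are close to the teacher's incurs a much smaller loss than one that disagrees, which is exactly the training signal distillation relies on.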
The invention combines the picture-generation technology of generative networks with the knowledge distillation technology of network compression: from all the category knowledge learned by a large network, it purposefully distills only the part of the target knowledge of interest into a smaller network. Meanwhile, the technology uses the generation network together with a loss function designed to match the classification distributions in the middle layers of the original teacher network, reducing the dependence on real data when training the small network.
Referring to fig. 1, a flowchart of a model network extraction and compression method based on a generated network and knowledge distillation provided in an embodiment of the present invention specifically includes the following steps:
s101: training a loss function of the generated network by using the trained teacher network to obtain a trained generated network;
when a trained teacher network is owned, the teacher network is supposed to contain valuable information in the training process, and the valuable information of the knowledge is represented in the output result of the teacher network for the calculation of input data in the network. The goal is to let the generator learn the output expression of the teacher's network, let the picture generated by the generator be more likely to be considered as a "normal" image by the teacher's network, and be able to successfully identify the category result of the small task object, thereby completing the process of knowledge extraction and transfer. Therefore, the output of the teacher network on the synthesized image can be used as an important index for the learning of the generator, so that the output result of the image generated by the generator on the teacher network, even the result of the middle layer, can approach the result of the flow of the real image in the teacher network;
the learning process of generating the network is as follows: and training parameters of the generated network by using the output result of the teacher network on the generated network generated picture as feedback, so that the output result of the generated picture on the teacher network by the generator is as close as possible to the output result of the real picture on the teacher network. The label for generating the network generated picture is obtained from the result of the teacher network.
Training the loss function of the generation network with the trained teacher network to obtain the trained generation network, wherein the loss function specifically is:

$L_G = \alpha L_{CE} + \beta L_{IE} + \gamma L_{B} + \delta L_{f}$

wherein $L_{CE}$ is the cross-entropy loss fed back by the teacher network on the generated pictures to the generator; $L_{IE}$ is the information entropy of the output on the target task, i.e. the uncertainty with which the teacher network judges a generated image as a target category; $L_B$ is the class-balance entropy over the target categories; $L_f$ is the distance of the network output feature maps; $\alpha$, $\beta$, $\gamma$ and $\delta$ are the weights of the four loss terms, each ranging from 0 to 1.

$L_{CE} = \frac{1}{n}\sum_{i=1}^{n} H_{cross}\left(y_i^T, t_i\right), \qquad t_i = \arg\max_j \left(y_i^T\right)_j$

wherein $y_i^T$ is the output of the teacher network on the $i$-th generated picture, $t_i$ is the pseudo label obtained from that output, and $n$ is the number of pictures the generator produces in one batch.

$L_{IE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{M}\left(y_i^T\right)_j \log\left(y_i^T\right)_j, \qquad L_B = \sum_{i=1}^{M} p_i \log p_i$

wherein $N$ is the total number of task categories of the trained model; $M$ is the number of task categories of the target part, $M < N$; and $p_i$ is the frequency with which the teacher network classifies the $n$ generated pictures into the $i$-th category.

$L_f = \sum_{l}\left( \left\lVert \mu_l(\hat{x}) - \mu_l(x) \right\rVert_2 + \left\lVert \sigma_l^2(\hat{x}) - \sigma_l^2(x) \right\rVert_2 \right)$

wherein the real image is defined as $x$ and the image produced by the generator as $\hat{x}$; $\mu_l(\hat{x})$ and $\sigma_l^2(\hat{x})$ are the mean and variance of the generated pictures at the $l$-th layer of the network.
specifically; the loss function for the design generator is:
the three components of the loss function have their respective optimization objectives: It is possible to make the generated image a composite image in the output layer that is fully compatible with the teacher's network, in other words, byThe generator can be made to learn how to make the generated picture more recognizable by the teacher network to success. Considering the pseudo label given to the generator by the teacher network as a real label, then
Andfrom the angle of the information entropy, the information entropy of the target category is obtained by calculating the output distribution of the teacher network to the target category: in information theory, how much a network output value contains the required information is expressed by quantifying the probability distribution of the output. If the probability that the image generated by the generator is judged to be the target category is higher after the image enters the teacher network, according to the information entropy theory, the uncertainty of the network for outputting the generated image to be the target category is smaller, the information quantity is smaller, the obtained result is also smaller,the smaller the size, but the entropy of the learning target class is still insufficient, and when the class imbalance occurs, the minimum value may still be obtained, but in this case, the balance of the number of samples of the generated pictures cannot be guaranteed. In view of this, introduceWhen the frequency distribution of each category is equal, the loss value is the minimum, namely, the generation network can generate the image of each category in the task target according to the average probability, so that the purpose of generating the image categories in a balanced manner is achieved. Andthe specific expression form is as follows:
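The two entropy-based terms can be sketched in NumPy. This is a hedged illustration under stated assumptions: the function names are mine, and the class frequencies are computed from argmax decisions to match the "frequency" wording above (in actual training a differentiable surrogate, such as the mean soft output, would be needed):

```python
import numpy as np

def sample_entropy(teacher_probs):
    """Mean information entropy of each output row; small when the teacher
    judges each generated picture as one target category with high confidence."""
    p = teacher_probs
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

def class_balance_loss(teacher_probs):
    """Sum of p_i * log(p_i) over batch-level class frequencies p_i; minimal
    (equal to log(1/M)) when every target category is generated equally often."""
    n, M = teacher_probs.shape
    freq = np.bincount(teacher_probs.argmax(axis=1), minlength=M) / n
    return float((freq * np.log(freq + 1e-12)).sum())

confident = np.array([[0.98, 0.01, 0.01],
                      [0.01, 0.98, 0.01],
                      [0.01, 0.01, 0.98]])   # confident AND balanced batch
vague = np.full((3, 3), 1.0 / 3.0)          # maximally uncertain outputs
imbalanced = np.array([[0.98, 0.01, 0.01],
                       [0.97, 0.02, 0.01],
                       [0.96, 0.02, 0.02]])  # everything lands in class 0
```

Note the two terms pull in different directions on purpose: `sample_entropy` rewards per-image confidence, while `class_balance_loss` rewards an even spread of categories across the batch.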
to take into account the quality of the generated picture, a regularization term for the image is added to the loss function of the generatorSuppose a true image is defined asThe image generated by the generator is defined asIn order to ensure the similarity of the extracted features of the generated image and the real image in the middle layer of the teacher network, the target problem is converted into minimizing the distance between the feature maps of the generated image and the real image in the middle layer. Assuming that the features extracted by the intermediate layer follow a Gaussian distribution, the regularization of the distances between feature maps can be defined as
When the calculation of the real image is lacked, the mean and variance of the data distribution of the real image can be obtained by using the output of a BN layer in the teacher network, and the formula is expressed as follows:although it is unknown how the teacher network is trained, it can be known that when the network introduces batch processing, it can capture the mean and variance of the network after batch processing the input, so the mean and variance of the network with respect to the real data can be obtained approximately, and therefore the distance of the network output feature map can be defined as:
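A minimal NumPy sketch of this BN-statistics regularizer, for one layer — the function name and the synthetic feature batch are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def bn_statistics_distance(gen_features, bn_mean, bn_var):
    """Distance between the generated batch's per-channel statistics and the
    running mean/variance stored in one BN layer of the teacher, used as a
    proxy for real-data statistics when no real images are available.

    gen_features: (n, C) activations of the generated batch entering the BN layer.
    """
    mu = gen_features.mean(axis=0)   # per-channel mean of the generated batch
    var = gen_features.var(axis=0)   # per-channel variance of the generated batch
    return float(np.linalg.norm(mu - bn_mean) + np.linalg.norm(var - bn_var))

rng = np.random.default_rng(0)
feats = rng.normal(loc=1.0, scale=2.0, size=(512, 4))
matched = bn_statistics_distance(feats, feats.mean(axis=0), feats.var(axis=0))
shifted = bn_statistics_distance(feats, feats.mean(axis=0) + 5.0, feats.var(axis=0))
```

The term is zero when the generated batch reproduces the stored statistics exactly and grows as the synthetic distribution drifts away from them, which is what keeps the generated pictures statistically plausible.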
s102: generating a plurality of generated pictures according to the generation network;
s103: inputting the generated pictures into a trained teacher network and a trained student network, and carrying out knowledge distillation on the student network;
Knowledge distillation, proposed by Hinton et al., is a process that enables knowledge to be learned between networks, which may have different structures or similar structures with different capacities. Traditional knowledge distillation requires a trained, well-performing network as the teacher network, which is typically complex, while a smaller network designed from the task requirements serves as the student network. Knowledge distillation holds that the output of the last layer of the teacher network contains the rich knowledge learned by the model, reflected in its output distribution; the student network's output is therefore trained to learn the output distribution of the teacher network's last layer.
In the technology of the invention, the distillation process specifically comprises: a set of $n$ random vectors $z_1, z_2, \ldots, z_n$ is input into the generation network $G$, and the output of the generation network is:

$\hat{x}_i = G(z_i), \quad i = 1, \ldots, n$

The generated pictures are input into the teacher network and the student network respectively, giving the teacher output $y_i^T$ and the student output $y_i^S$. With knowledge distillation, the optimization objective function of the student network is:

$W_S^{*} = \arg\min_{W_S} \frac{1}{n}\sum_{i=1}^{n} H_{cross}\left(y_i^T, y_i^S\right)$

wherein $W_S$ is the parameters of the student network.
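The student objective above can be evaluated as a mean cross-entropy between the two networks' output distributions on the generated batch. A minimal sketch, with hypothetical names and example probabilities of my own choosing:

```python
import numpy as np

def student_objective(teacher_probs, student_probs):
    """Mean cross-entropy between teacher and student output distributions on
    the generated pictures; minimizing this over the student parameters W_S
    fits the student network to the teacher network."""
    return float(-(teacher_probs * np.log(student_probs + 1e-12)).sum(axis=1).mean())

teacher_out = np.array([[0.90, 0.05, 0.05],
                        [0.10, 0.85, 0.05]])
good_student = np.array([[0.88, 0.06, 0.06],
                         [0.12, 0.82, 0.06]])
bad_student = np.array([[0.10, 0.45, 0.45],
                        [0.80, 0.10, 0.10]])
```

By Gibbs' inequality the objective is smallest when the student exactly reproduces the teacher's distribution, so gradient descent on it over the student's parameters transfers the teacher's knowledge using only generated pictures.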
S104: and updating the student network.
Experiments with the model network knowledge extraction and compression technique based on the combination of generation network and knowledge distillation were performed on three common image classification datasets: cifar10, cifar100 and Natural Scene Image Classification. The cifar10 and cifar100 images are 32 x 32 x 3 in size and the Natural Scene Image Classification images are 112 x 112 x 3. The task target of the trained model is image classification, and the task target of the smaller network model is classification of some of the image classes in the dataset. The teacher network uses a trained Resnet34 structure and the student network a Resnet18 structure. The results are shown in the following table:
the method has the advantages that the generated network is utilized to directly transfer part of the task knowledge needed by the trained network knowledge of the large-scale teacher, so that a good partial knowledge distillation process can be performed on the accuracy of part of the task targets of the original model. Moreover, partial task knowledge migration with different classification quantities on the original model has a better effect.
The network knowledge extraction and compression technique based on the combination of generation network and knowledge distillation provided by the invention can, when facing a large network, learn only the classification knowledge of specific classes in the large network according to the task at hand and migrate it to a smaller network. Meanwhile, the method depends less on data, carrying out knowledge distillation without real data and reducing the dependence of conventional knowledge distillation on real data.
In addition, the method focuses on model compression and extraction of task-target categories. With the improving computing capacity of high-speed computing equipment and the convenience of sharing network resources, trained networks are easier and easier to obtain; how to extract part of the task knowledge in such a network and move it into a smaller network is, however, a practical problem. The method solves both problems at once and can flexibly adapt to different practical application requirements. Starting from a trained large-scale network, the method reduces the difficulty of training a small-scale network that achieves a good classification effect on part of the task targets, and can be applied to various systems more conveniently.
The above description is only an embodiment of the present invention, but the design concept of the present invention is not limited thereto; any insubstantial modification made using this design concept falls within the scope of the invention.
Claims (5)
1. A model network extraction and compression method based on generation network and knowledge distillation is characterized by comprising the following steps:
training the loss function of the generation network by using a teacher network trained on the cifar10, cifar100 and Natural Scene Image Classification image datasets to obtain a trained generation network, wherein the task target of the trained teacher network is image classification;
generating a plurality of generated pictures according to the generation network;
inputting the generated pictures into a trained teacher network and a trained student network, and carrying out knowledge distillation on the student network;
updating the student network;
training the loss function of the generation network by using the trained teacher network to obtain the trained generation network, wherein the loss function specifically is:

$L_G = \alpha L_{CE} + \beta L_{IE} + \gamma L_{B} + \delta L_{f}$

wherein $L_{CE}$ is the cross-entropy loss fed back by the teacher network on the generated pictures to the generator; $L_{IE}$ is the information entropy of the output on the target task, i.e. the uncertainty with which the teacher network judges a generated image as a target category; $L_B$ is the class-balance entropy over the target categories; $L_f$ is the distance of the network output feature maps; $\alpha$, $\beta$, $\gamma$, $\delta$ are the weights of the four loss terms in the generator loss function and range from 0 to 1;
the generated pictures are input into a teacher network and a student network, knowledge distillation is carried out on the student network, and the method specifically comprises the following steps:
a set of $n$ random vectors $z_1, z_2, \ldots, z_n$ is input into the generation network $G$, and the output of the generation network is:

$\hat{x}_i = G(z_i), \quad i = 1, \ldots, n$

the generated pictures are input into the teacher network and the student network respectively to obtain the teacher output $y_i^T$ and the student output $y_i^S$; with knowledge distillation, the optimization objective function of the student network is:

$W_S^{*} = \arg\min_{W_S} \frac{1}{n}\sum_{i=1}^{n} H_{cross}\left(y_i^T, y_i^S\right)$

wherein $W_S$ is the parameters of the student network.
2. The model network extraction and compression method based on a generation network and knowledge distillation according to claim 1, wherein training the loss function of the generation network with the trained teacher network to obtain the trained generation network specifically comprises:
using the classification results output by the trained teacher network on the pictures produced by the generation network as feedback;
calculating the loss function of the generation network from this feedback;
calculating the gradient of the loss function and updating the parameters of the generator network; and when the teacher network's outputs on the generated pictures and its classification results on real pictures meet the set requirements, obtaining the trained generation network.
3. The model network extraction and compression method based on a generation network and knowledge distillation according to claim 1, wherein the loss function $L_{CE}$ specifically is:

$L_{CE} = \frac{1}{n}\sum_{i=1}^{n} H_{cross}\left(y_i^T, t_i\right), \qquad t_i = \arg\max_j \left(y_i^T\right)_j$

wherein $y_i^T$ is the output of the teacher network on the $i$-th generated picture, $t_i$ is the pseudo label obtained from that output, and $n$ is the number of pictures the generator produces in one batch.
4. The model network extraction and compression method based on a generation network and knowledge distillation according to claim 3, wherein the loss functions $L_{IE}$ and $L_B$ specifically are:

$L_{IE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{M}\left(y_i^T\right)_j \log\left(y_i^T\right)_j, \qquad L_B = \sum_{i=1}^{M} p_i \log p_i$

wherein $N$ is the total number of task categories of the trained model; $M$ is the number of task categories of the target part, $M < N$; and $p_i$ is the frequency with which the teacher network classifies the $n$ generated pictures into the $i$-th category.
5. The model network extraction and compression method based on a generation network and knowledge distillation according to claim 3, wherein the loss function $L_f$ specifically is:

$L_f = \sum_{l}\left( \left\lVert \mu_l(\hat{x}) - \mu_l^{BN} \right\rVert_2 + \left\lVert \sigma_l^2(\hat{x}) - \sigma_l^{2,BN} \right\rVert_2 \right)$
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110320646.8A | 2021-03-25 | 2021-03-25 | Model network extraction and compression method based on generation network and knowledge distillation |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110320646.8A | 2021-03-25 | 2021-03-25 | Model network extraction and compression method based on generation network and knowledge distillation |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN113112020A | 2021-07-13 |
| CN113112020B | 2022-06-28 |
Family: ID=76712144

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110320646.8A (Active) | CN113112020B | 2021-03-25 | 2021-03-25 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN113112020B (en) |
Families Citing this family (6)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113792606B * | 2021-08-18 | 2024-04-26 | Tsinghua University | Low-cost self-supervision pedestrian re-identification model construction method based on multi-target tracking |
| CN113688990B * | 2021-09-09 | 2024-08-16 | Guizhou Power Grid Co., Ltd. | Data-free quantitative training method for power edge calculation classification neural network |
| CN114095447B * | 2021-11-22 | 2024-03-12 | Chengdu Zhongke Micro Information Technology Research Institute Co., Ltd. | Communication network encryption flow classification method based on knowledge distillation and self-distillation |
| CN114897155A | 2022-03-30 | 2022-08-12 | Beijing Institute of Technology | Integrated model data-free compression method for satellite |
| CN115564024B * | 2022-10-11 | 2023-09-15 | Tsinghua University | Characteristic distillation method, device, electronic equipment and storage medium for generating network |
| CN116594994B * | 2023-03-30 | 2024-02-23 | Chongqing Normal University | Application method of visual language knowledge distillation in cross-modal hash retrieval |
Citations (5)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111160533A * | 2019-12-31 | 2020-05-15 | Sun Yat-sen University | Neural network acceleration method based on cross-resolution knowledge distillation |
| CN111709476A * | 2020-06-17 | 2020-09-25 | Inspur Group Co., Ltd. | Knowledge distillation-based small classification model training method and device |
| CN111967534A * | 2020-09-03 | 2020-11-20 | Fuzhou University | Incremental learning method based on generative adversarial network knowledge distillation |
| CN112116030A * | 2020-10-13 | 2020-12-22 | Zhejiang University | Image classification method based on vector standardization and knowledge distillation |
| CN112465111A * | 2020-11-17 | 2021-03-09 | Dalian University of Technology | Three-dimensional voxel image segmentation method based on knowledge distillation and adversarial training |
2021
- 2021-03-25: CN application CN202110320646.8A filed; patent CN113112020B; status: Active
Non-Patent Citations (4)
Title |
---|
Data-Free Learning of Student Networks; Hanting Chen et al.; arXiv; 2019-12-31; full text *
Densely Distilled Flow-Based Knowledge Transfer in Teacher-Student Framework for Image Classification; Ji-Hoon Bae et al.; IEEE Transactions on Image Processing; 2020-04-06; Vol. 29; full text *
Super-resolution convolutional neural network compression method based on knowledge distillation; Gao Qinquan et al.; Journal of Computer Applications; 2019-11-18; Vol. 39, No. 10; pp. 2802-2808 *
Research on model compression methods based on quantized convolutional neural networks; Hao Liyang; China Masters' Theses Full-text Database (Information Science and Technology); 2020-07-15; No. 7; pp. I138-1277 *
Also Published As
Publication number | Publication date |
---|---|
CN113112020A (en) | 2021-07-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113112020B (en) | Model network extraction and compression method based on generation network and knowledge distillation | |
CN108564029B (en) | Face attribute recognition method based on cascade multitask learning deep neural network | |
CN105701502B (en) | Automatic image annotation method based on Monte Carlo data equalization | |
CN112446423B (en) | Fast hybrid high-order attention domain adversarial network method based on transfer learning |
CN114841257B (en) | Few-shot object detection method based on self-supervised contrastive constraint |
CN110633708A (en) | Deep network significance detection method based on global model and local optimization | |
CN109816032A (en) | Zero-shot classification method and apparatus with unbiased mapping based on generative adversarial network |
CN112418351B (en) | Zero-shot learning image classification method based on global and local context awareness |
CN113487629B (en) | Image attribute editing method based on structured scene and text description | |
CN109635140B (en) | Image retrieval method based on deep learning and density peak clustering | |
CN108710894A (en) | Active learning annotation method and device based on cluster representative points |
CN113569895A (en) | Image processing model training method, processing method, device, equipment and medium | |
CN112862015A (en) | Paper classification method and system based on hypergraph neural network | |
CN115937774A (en) | Security inspection contraband detection method based on feature fusion and semantic interaction | |
CN109947948B (en) | Knowledge graph representation learning method and system based on tensor | |
CN112017255A (en) | Method for generating food image according to recipe | |
CN114357307B (en) | News recommendation method based on multidimensional features | |
CN116258990A (en) | Few-shot referring video object segmentation method based on cross-modal affinity |
Fan et al. | A global and local surrogate-assisted genetic programming approach to image classification | |
CN114202021A (en) | Knowledge distillation-based efficient image classification method and system | |
CN116957304A (en) | Unmanned aerial vehicle group collaborative task allocation method and system | |
CN116797850A (en) | Class increment image classification method based on knowledge distillation and consistency regularization | |
Zhu et al. | Incremental classifier learning based on PEDCC-loss and cosine distance | |
CN112990336B (en) | Deep three-dimensional point cloud classification network construction method based on competitive attention fusion | |
He et al. | ECS-SC: Long-tailed classification via data augmentation based on easily confused sample selection and combination |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||