CN108764462A - Convolutional neural network optimization method based on knowledge distillation - Google Patents

Convolutional neural network optimization method based on knowledge distillation Download PDF

Info

Publication number
CN108764462A
CN108764462A (application number CN201810530304.7A)
Authority
CN
China
Prior art keywords
fpn
networks
network
convolutional neural
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810530304.7A
Other languages
Chinese (zh)
Inventor
王标
隆刚
史方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu View World Science And Technology Co Ltd
Original Assignee
Chengdu View World Science And Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu View World Science And Technology Co Ltd filed Critical Chengdu View World Science And Technology Co Ltd
Priority to CN201810530304.7A priority Critical patent/CN108764462A/en
Publication of CN108764462A publication Critical patent/CN108764462A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a convolutional neural network optimization method based on knowledge distillation: positions are selected in the additional structure of the feature-pyramid part of FPN to establish bridges; multiple feature adaptation layers are established at the bridge positions between the teacher FPN network T and the student FPN network S; and a hierarchically weighted multi-scale loss function is used as the loss function for network training. The positive effects of the invention are as follows. On the one hand, the knowledge-distillation design of the invention can compress a complex teacher FPN network into a smaller, faster student FPN network, which is more convenient to deploy for edge-side computing than existing CNN-based object-detection techniques that use FPN directly. On the other hand, with regard to how the distillation is carried out, and compared with existing knowledge-distillation techniques, the invention is better suited to the multi-scale object-detection network FPN and can better train a high-quality student FPN network.

Description

Convolutional neural network optimization method based on knowledge distillation
Technical field
The present invention relates to a convolutional neural network optimization method based on knowledge distillation.
Background art
At present, owing to its powerful representational capacity, deep learning extracts features that are more robust than the hand-crafted features of conventional methods. Deep learning techniques with the convolutional neural network (CNN) as their representative are therefore widely used in a variety of traditional computer-vision tasks such as image classification, object detection, and image segmentation. In image classification, the typical CNN practice is to train the CNN model using cross-entropy as the loss function.
In recent years, deep learning has shown three main trends: model structures grow increasingly complex, model hierarchies grow deeper, and large-scale datasets keep developing. However, as the demand for edge computing with CNNs on mobile and embedded platforms keeps rising, and because edge-side computing platforms are resource-constrained, CNN models are required to be as small as possible and computationally efficient. To this end, academia and industry have in recent years proposed various kinds of model-compression methods, such as model pruning, low-rank decomposition, and low-precision quantization of model parameters. In the 2014 paper Distilling the Knowledge in a Neural Network, Hinton et al. proposed a knowledge-distillation method: a large CNN trained on a large training dataset serves as the teacher network, and a small CNN serves as the student network; the student network is trained jointly on the probability-distribution vectors output by the teacher network and on the manual annotations of the training set. They demonstrated that this method can overcome the difficulty of training a small CNN on a large dataset, and that after training the student can obtain test results in classification tasks near or above those of the teacher network. The method can be regarded as a means of knowledge transfer: during training, knowledge is transferred from the teacher network to the student network. The design goal after the transfer is complete is to replace the large, cumbersome teacher network with a small, fast, agile student network for carrying out the task, which greatly facilitates the deployment of deep learning on edge-side platforms.
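As a minimal sketch of the joint training just described (the teacher's probability distribution combined with the hard labels), the following assumes a single sample with raw logits; the softening temperature and the mixing weight alpha are illustrative choices for the sketch, not values given by Hinton's paper or by this patent:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Numerically stable softmax at a given temperature.
    z = (logits - logits.max()) / temperature
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=2.0, alpha=0.5):
    # Soft term: cross-entropy against the teacher's softened distribution.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = -np.sum(p_teacher * np.log(p_student + 1e-12))
    # Hard term: ordinary cross-entropy against the ground-truth label.
    hard = -np.log(softmax(student_logits)[hard_label] + 1e-12)
    # Joint objective: weighted mix of soft and hard supervision.
    return alpha * soft + (1.0 - alpha) * hard
```

The student is thus pulled toward both the teacher's output distribution and the ground-truth annotations at once, which is the combination the paragraph above describes.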
After Hinton et al. proposed the theory of knowledge distillation, Romero et al. proposed a new method: matching features between the intermediate layers of the teacher network and the student network provides a kind of reliable supervisory signal (hint), which effectively guides the training of student networks with deep structures. This overcomes a weakness of Hinton's method and makes distilling the knowledge of large deep networks possible.
At present, existing knowledge-distillation techniques have been studied primarily for CNNs used in classification. Meanwhile, applications of deep-learning technology, with face recognition as a representative example, are currently flourishing, and scenes of endless variety keep emerging, which poses a challenge to the robustness of face-detection algorithms. For example, because of differences in distance from the camera, different people, or the same person at different moments, faces appear in images at widely different scales. Similarly, other kinds of object detection also face this problem in particular scenes. Even though many object-detection algorithms are based on CNNs, their performance on small-scale targets remains unsatisfactory. Lin et al. therefore proposed a multi-scale object-detection network, the Feature Pyramid Network (FPN). Its backbone network is typically based on a deep model with ResNet as the representative, and it fuses multi-scale features to detect targets. It achieves very good detection performance and shows advantages over traditional algorithms in aspects such as small-target detection. However, because it uses a large backbone network similar to ResNet, deploying FPN on edge-side platforms faces many limitations in computation, storage, and so on. The present invention performs knowledge distillation for FPN, reducing its network size and computational load while maintaining its object-detection performance, thereby favoring its deployment on edge-side platforms.
The background of FPN:
Fig. 1 shows the case of a typical object-detection network such as Faster R-CNN. The input image enters at the first layer of its backbone network (e.g. ResNet), and the data are propagated forward all the way to the last layer, where the detection-related predictions are made. It is therefore a single-scale structure, without the multi-scale pyramid design.
Fig. 2 is the skeleton diagram of FPN. Besides the usual bottom-up propagation within the backbone, from the first layer to the last, there are also top-down connections and lateral connections that generate an additional feature pyramid (right half of the sketch).
Fig. 3 details the lateral connections and top-down connections between the feature-pyramid layers: FPN adds extra 1x1 convolutional layers, upsampling, and element-wise (Eltwise) layers beyond the backbone network.
Because FPN makes the detection-related predictions at each layer of the feature pyramid, it gains an advantage in multi-scale object detection that ordinary object-detection networks do not have.
For ordinary object-detection networks, Chen et al. proposed a knowledge-distillation method in the document Learning Efficient Object Detection Models with Knowledge Distillation. Considering the differences between the object-detection task and the basic classification task, they specifically designed the loss functions needed to train the network, as well as feature adaptation layers attached on the backbone between the teacher network and the student network during training.
However, FPN, as a special multi-scale object-detection network, is not suited to applying the method of that document directly for knowledge distillation. For example, the feature adaptation layer designed in Chen's document bridges the teacher network and the student network directly on the backbone, performing the intermediate-feature adaptation between them there. Under their design, the features used for the detection task are indeed produced on the backbone network, so this is appropriate. FPN, however, does not use the features output on the backbone network directly: it adds additional structures beyond the backbone, and the features applied to the detection task are exactly those output by these structures. Applying the above method unchanged to FPN is therefore unsuitable.
For another example, the method of that document has only one feature adaptation layer, because it is based on a single-scale task, which is unobjectionable there. FPN is different: it uses multiple feature outputs for a multi-scale task, so with only one feature adaptation layer, good matching of the multi-scale features between the teacher network and the student network cannot be guaranteed.
Summary of the invention
In order to overcome the shortcomings of the prior art, the present invention provides a convolutional neural network optimization method based on knowledge distillation: the invention devises a multi-scale knowledge-distillation method so that it is better suited to the multi-scale object-detection network FPN.
The technical solution adopted by the present invention is: a convolutional neural network optimization method based on knowledge distillation, in which positions are selected in the additional structure of the feature-pyramid part of FPN to establish bridges; multiple feature adaptation layers are established at the bridge positions between the teacher FPN network T and the student FPN network S; and a hierarchically weighted multi-scale loss function is used as the loss function for network training.
Compared with the prior art, the positive effects of the present invention are:
On the one hand, based on the knowledge-distillation design of the invention, a student FPN network is trained under the guidance of a teacher FPN network, making it possible to compress a complex teacher FPN network into a smaller, faster student FPN network. Compared with existing CNN-based object-detection techniques that use FPN directly, this makes edge-side deployment much more convenient.
On the other hand, with regard to how the distillation is carried out, and compared with existing knowledge-distillation techniques, the invention is better suited to the multi-scale object-detection network FPN and can better train a high-quality student FPN network.
Description of the drawings
Examples of the present invention will now be described with reference to the accompanying drawings, in which:
Fig. 1 shows the case of a typical object-detection network;
Fig. 2 is the skeleton diagram of FPN;
Fig. 3 illustrates the lateral connections and top-down connections between the feature-pyramid layers.
Detailed description of the embodiments
The method of the present invention comprises the following:
First, the present invention designs the feature adaptation layers on the additional structures outside the backbone network, in contrast to Chen's document, which performs the feature adaptation between the teacher network and the student network on the backbone itself.
Second, what the invention designs on the additional structures are multiple (selectable) feature adaptation layers, based on which pairwise matching of multi-layer features, adapted to the multi-scale feature pyramid, is carried out between the teacher network and the student network.
Third, in the design of the loss function, the invention takes the multi-scale detection function of FPN into account and devises a multi-scale loss function with multi-level loss weighting, again differing from Chen's document.
Specifically, the discussion below is separated into two parts: structure and loss function.
First, regarding structure: the present invention adopts a feature-adaptation approach similar to that of Chen's document, i.e. feature adaptation is performed with 1x1 convolutional layers at the bridge positions between the teacher network and the student network. The main role of the 1x1 convolution is to adapt the channel count of the teacher-network intermediate-layer feature maps (the input) to the channel count of the student-network intermediate-layer feature maps (the output).
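A 1x1 convolution acts at every pixel as a linear map across channels, which is exactly what is needed to turn the teacher's channel count into the student's. A minimal numpy sketch (channel counts and spatial size are arbitrary illustrative choices, not values from the patent):

```python
import numpy as np

def conv1x1(x, w):
    # x: teacher feature map of shape (C_t, H, W).
    # w: adaptation weights of shape (C_s, C_t).
    # Mixes channels at every spatial position; H and W are left untouched.
    return np.einsum('sc,chw->shw', w, x)

# Adapt a hypothetical 256-channel teacher map to a 64-channel student map.
teacher_map = np.random.randn(256, 7, 7)
weights = np.random.randn(64, 256) * 0.01
adapted = conv1x1(teacher_map, weights)
assert adapted.shape == (64, 7, 7)
```

Because the spatial dimensions pass through unchanged, the adapted teacher map can be compared element-wise with the student map of the same pyramid level.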
Unlike the essence of Chen's document, however, the present invention does not select bridge positions on the backbone network, but selects positions in the additional structure of the feature-pyramid part of FPN to establish the bridges.
Suppose the additional structure of the teacher FPN network T outputs Nt feature maps and the additional structure of the student FPN network S outputs Ns feature maps. The present invention then establishes n feature adaptation layers between T and S, where 1 ≤ n ≤ N and N = min(Nt, Ns). In other words, if the additional structure of the student network outputs fewer feature maps, the number of adaptation layers established by the invention ranges over [1, Ns]; if, conversely, the additional structure of the teacher network outputs fewer, the range is [1, Nt].
If Nt and Ns are both 4, the number of adaptation layers ranges from 1 to 4.
Take the case of the original FPN paper, with ResNet as the backbone network, as an example, and assume T and S have identical structure. Let {C2, C3, C4, C5} be the output feature maps of the conv2, conv3, conv4, and conv5 convolutional layers of ResNet, and let {P2, P3, P4, P5} in the additional structure of FPN be the output feature maps of the same sizes as {C2, C3, C4, C5} on the backbone network, respectively. Distinguishing {P2, P3, P4, P5} on T and on S by the suffixes _t and _s respectively, this embodiment selects the positions of the adaptation layers based on the feature maps {P2_t, P3_t, P4_t, P5_t} and {P2_s, P3_s, P4_s, P5_s}.
For example, with n=1, the adaptation-layer position selected by this embodiment is any one of the set {(P2_t, P2_s), (P3_t, P3_s), (P4_t, P4_s), (P5_t, P5_s)}. Taking (P2_t, P2_s) as an example, this means the P2_t of T serves as the input of the adaptation layer; after the 1x1 convolution it is output into S and matched against the P2_s of S, and so on. Preferably, the adaptation-layer position (P3_t, P3_s) or (P4_t, P4_s) is selected; that is, among the optional positions, a relatively central one is chosen to establish the bridge.
With n=4, the positions of the 4 feature adaptation layers are {(P2_t, P2_s), (P3_t, P3_s), (P4_t, P4_s), (P5_t, P5_s)}.
If n takes an intermediate value, say n=2, then preferably one position is selected from {(P2_t, P2_s), (P3_t, P3_s)} to establish a bridge, and another position is selected from {(P4_t, P4_s), (P5_t, P5_s)} to establish a bridge.
Suppose multiple bridges are established between T and S. Then the positions of any two of these bridges, (Pi_t, Pk_s) and (Pj_t, Pl_s), must satisfy the constraints: (1) i ≠ j and k ≠ l; (2) if i > j, then k > l. This guarantees that no two bridge positions coincide, and that no two bridge positions between T and S cross in the scale dimension.
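The two constraints can be checked mechanically. A small sketch, where the pair (i, k) stands for a bridge (Pi_t, Pk_s); the function name and representation are illustrative, not part of the patent:

```python
def bridges_valid(bridges):
    """Check the two bridge constraints on a set of (i, k) pairs.

    (1) No two bridges may share a teacher level or a student level.
    (2) Bridges must not cross: if i > j then k > l.
    """
    for a in range(len(bridges)):
        for b in range(a + 1, len(bridges)):
            i, k = bridges[a]
            j, l = bridges[b]
            if i == j or k == l:        # constraint (1): no coincidence
                return False
            if (i > j) != (k > l):      # constraint (2): no crossing
                return False
    return True
```

For instance, {(P2_t, P2_s), (P3_t, P3_s)} passes, while the crossing pair {(P2_t, P3_s), (P3_t, P2_s)} violates constraint (2).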
Second, the design of the loss function for network training. In the method of Chen's document, the loss terms L_RPN and L_RCN of the RPN (region proposal network) part and the RCN (region classification and box regression) part of the object-detection network are respectively:
L_RPN = (1/M)·Σ_i L_cls + λ·(1/M)·Σ_{i,j} L_reg
L_RCN = (1/N)·Σ_i L_cls + λ·(1/N)·Σ_{i,j} L_reg
where λ is a hyperparameter, N is the batch size of the RCN, and M is the batch size of the RPN. The classification loss L_cls is a combination of the hard softmax loss based on ground-truth labels and the soft loss based on knowledge distillation; the box-regression loss L_reg is a combination of the smooth L1 loss and the teacher-bounded L2 loss.
In defining the loss terms L_RPN and L_RCN, the present invention draws on Chen's document. The total loss function L of that document is:
L = L_RPN + L_RCN + γ·L_hint
where γ is a hyperparameter and L_hint is the feature-comparison loss between the teacher network and the student network after feature adaptation. As can be seen, L is the sum of the RPN loss, the RCN loss, and the feature-comparison loss.
Based on a similar idea, and further optimized for the multi-scale character of the features, the present invention defines the loss function:
L = L_RPN + L_RCN + Σ_{i=1..n} γ_i·L_hint_i
where Σ_{i=1..n} γ_i·L_hint_i is the weighted sum of the feature-comparison losses after the n feature adaptation layers, and the hyperparameter γ_i is the weight of the feature-comparison loss L_hint_i after the i-th feature adaptation layer. Each L_hint_i is computed as:
L_hint_i = ||Z_i − V_i||²
where Z_i is the teacher-network intermediate-layer feature at the input of the current feature adaptation layer, taken after feature adaptation, and V_i is the feature at the output end of the current feature adaptation layer, i.e. the intermediate-layer feature of the corresponding student network.
In the present invention, the γ_i serve to control two kinds of balance: first, the trade-off in importance between the post-adaptation feature-comparison losses and the other types of loss; second, the trade-off within the post-adaptation feature-comparison loss itself, i.e. the relative importance of the loss at each scale.
During network training, the γ_i can be adjusted flexibly so that the loss function better fits the specific detection task. Preferably, let γ_2 and γ_3 be the post-adaptation feature-comparison loss weights at positions (P2_t, P2_s) and (P3_t, P3_s) respectively. When, for example, the small-scale targets to be detected in a specific task are relatively numerous, one can set γ_2 > γ_3 to strengthen the optimization of small-scale target detection; if, conversely, large-scale targets predominate, one can set γ_2 < γ_3, and so on.
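Putting the pieces together, the hierarchically weighted sum Σ γ_i·L_hint_i can be sketched in numpy as follows. The adaptation is the 1x1 convolution written as a channel-mixing matrix; all shapes and weights are illustrative, and a real implementation would use a deep-learning framework so that the adapters are trained jointly with the student:

```python
import numpy as np

def multi_scale_hint_loss(teacher_feats, student_feats, adapters, gammas):
    # teacher_feats / student_feats: per-level feature maps Z_i and V_i,
    #   each of shape (C, H, W), one pair per bridged pyramid level.
    # adapters: per-level 1x1-conv weights of shape (C_s, C_t).
    # gammas: per-level weights gamma_i (e.g. gamma_2 > gamma_3 to favour
    #   small-scale targets, as the text suggests).
    total = 0.0
    for z, v, w, g in zip(teacher_feats, student_feats, adapters, gammas):
        adapted = np.einsum('sc,chw->shw', w, z)   # adapt Z_i to student channels
        total += g * np.sum((adapted - v) ** 2)    # L2 feature-comparison loss
    return total
```

Raising one γ_i makes the corresponding pyramid level dominate the hint term, which is exactly the per-scale rebalancing described above.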
As described in the original FPN paper, the whole is a generic architecture; the backbone part of the FPN structure involved in the present invention is therefore likewise not limited to the ResNet of the embodiment and may be another deep convolutional neural network.
The main differences between the present invention and the prior art are summarized as follows:
1. The position of the feature adaptation layers (the additional structures of the backbone network vs. the backbone network itself): being closer than the backbone to the features the task actually uses, the teacher network can provide more effective supervisory signals (hints) for the student network;
2. The number of feature adaptation layers (single-scale adaptation vs. multi-scale adaptation, i.e. n layers within the given range of values), which better fits the feature pyramid at the heart of FPN;
3. The design of the loss function (a single-scale loss function vs. a hierarchically weighted multi-scale loss function), which better controls both the balance inside the multi-scale feature-comparison loss, i.e. among the per-scale loss terms, and the balance between it and the other types of loss terms; and so on.

Claims (8)

1. A convolutional neural network optimization method based on knowledge distillation, characterized in that: positions are selected in the additional structure of the feature-pyramid part of FPN to establish bridges; multiple feature adaptation layers are established at the bridge positions between a teacher FPN network T and a student FPN network S; and a hierarchically weighted multi-scale loss function is used as the loss function for network training.
2. The convolutional neural network optimization method based on knowledge distillation according to claim 1, characterized in that: feature adaptation is performed with 1x1 convolutional layers at the bridge positions between the teacher FPN network T and the student FPN network S.
3. The convolutional neural network optimization method based on knowledge distillation according to claim 1, characterized in that: if multiple bridges are established between the teacher FPN network T and the student FPN network S, the positions of any two bridges (Pi_t, Pk_s) and (Pj_t, Pl_s) must satisfy the following constraints: (1) i ≠ j and k ≠ l; (2) if i > j, then k > l.
4. The convolutional neural network optimization method based on knowledge distillation according to claim 1, characterized in that: the number n of feature adaptation layers is determined as follows: 1 ≤ n ≤ N and N = min(Nt, Ns), where Nt denotes the number of feature maps output by the additional structure of the teacher FPN network T, and Ns denotes the number of feature maps output by the additional structure of the student FPN network S.
5. The convolutional neural network optimization method based on knowledge distillation according to claim 1, characterized in that: the hierarchically weighted multi-scale loss function is:
L = L_RPN + L_RCN + Σ_{i=1..n} γ_i·L_hint_i
where L_RPN and L_RCN respectively denote the loss terms of the RPN part and the RCN part of the object-detection network; Σ_{i=1..n} γ_i·L_hint_i is the weighted sum of the feature-comparison losses after the n feature adaptation layers; and the hyperparameter γ_i is the weight of the feature-comparison loss L_hint_i after the i-th feature adaptation layer.
6. The convolutional neural network optimization method based on knowledge distillation according to claim 5, characterized in that: the feature-comparison loss L_hint_i after each feature adaptation layer is calculated as follows:
L_hint_i = ||Z_i − V_i||²
where Z_i is the input terminal of the current feature adaptation layer and V_i is the output terminal of the current feature adaptation layer.
7. The convolutional neural network optimization method based on knowledge distillation according to claim 5, characterized in that: L_RPN and L_RCN are respectively calculated by the following formulas:
L_RPN = (1/M)·Σ_i L_cls + λ·(1/M)·Σ_{i,j} L_reg
L_RCN = (1/N)·Σ_i L_cls + λ·(1/N)·Σ_{i,j} L_reg
where λ is a hyperparameter, N is the batch size of the RCN, and M is the batch size of the RPN; the classification loss L_cls is a combination of the hard softmax loss based on ground-truth labels and the soft loss based on knowledge distillation, and the box-regression loss L_reg is a combination of the smooth L1 loss and the teacher-bounded L2 loss.
8. The convolutional neural network optimization method based on knowledge distillation according to claim 5, characterized in that: when the small-scale targets to be detected in a specific task are relatively numerous, γ_i > γ_{i+1} is set; otherwise, when large-scale targets are more numerous, γ_i < γ_{i+1} is set.
CN201810530304.7A 2018-05-29 2018-05-29 Convolutional neural network optimization method based on knowledge distillation Pending CN108764462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810530304.7A CN108764462A (en) 2018-05-29 2018-05-29 Convolutional neural network optimization method based on knowledge distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810530304.7A CN108764462A (en) 2018-05-29 2018-05-29 Convolutional neural network optimization method based on knowledge distillation

Publications (1)

Publication Number Publication Date
CN108764462A true CN108764462A (en) 2018-11-06

Family

ID=64003296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810530304.7A Pending CN108764462A (en) 2018-05-29 2018-05-29 Convolutional neural network optimization method based on knowledge distillation

Country Status (1)

Country Link
CN (1) CN108764462A (en)

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635111A (en) * 2018-12-04 2019-04-16 国网江西省电力有限公司信息通信分公司 A kind of news click bait detection method based on network migration
CN109816636A (en) * 2018-12-28 2019-05-28 汕头大学 A kind of crack detection method based on intelligent terminal
CN109886343A (en) * 2019-02-26 2019-06-14 深圳市商汤科技有限公司 Image classification method and device, equipment, storage medium
CN109919110A (en) * 2019-03-13 2019-06-21 北京航空航天大学 Video area-of-interest-detection method, device and equipment
CN110059717A (en) * 2019-03-13 2019-07-26 山东大学 Convolutional neural networks automatic division method and system for breast molybdenum target data set
CN110120036A (en) * 2019-04-17 2019-08-13 杭州数据点金科技有限公司 A kind of multiple dimensioned tire X-ray defect detection method
CN110245754A (en) * 2019-06-14 2019-09-17 西安邮电大学 A kind of knowledge distillating method based on position sensing figure
CN110263842A (en) * 2019-06-17 2019-09-20 北京影谱科技股份有限公司 For the neural network training method of target detection, device, equipment, medium
CN110298227A (en) * 2019-04-17 2019-10-01 南京航空航天大学 A kind of vehicle checking method in unmanned plane image based on deep learning
CN111062951A (en) * 2019-12-11 2020-04-24 华中科技大学 Knowledge distillation method based on semantic segmentation intra-class feature difference
CN111179212A (en) * 2018-11-10 2020-05-19 杭州凝眸智能科技有限公司 Method for realizing micro target detection chip integrating distillation strategy and deconvolution
CN111178115A (en) * 2018-11-12 2020-05-19 北京深醒科技有限公司 Training method and system of object recognition network
CN111260449A (en) * 2020-02-17 2020-06-09 腾讯科技(深圳)有限公司 Model training method, commodity recommendation device and storage medium
CN111275646A (en) * 2020-01-20 2020-06-12 南开大学 Edge-preserving image smoothing method based on deep learning knowledge distillation technology
CN111312271A (en) * 2020-02-28 2020-06-19 云知声智能科技股份有限公司 Model compression method and system for improving convergence rate and processing performance
WO2020143225A1 (en) * 2019-01-08 2020-07-16 南京人工智能高等研究院有限公司 Neural network training method and apparatus, and electronic device
CN111428191A (en) * 2020-03-12 2020-07-17 五邑大学 Antenna downward inclination angle calculation method and device based on knowledge distillation and storage medium
CN111461212A (en) * 2020-03-31 2020-07-28 中国科学院计算技术研究所 Compression method for point cloud target detection model
CN111476167A (en) * 2020-04-09 2020-07-31 北京中科千寻科技有限公司 student-T distribution assistance-based one-stage direction remote sensing image target detection method
CN111554268A (en) * 2020-07-13 2020-08-18 腾讯科技(深圳)有限公司 Language identification method based on language model, text classification method and device
CN111626330A (en) * 2020-04-23 2020-09-04 南京邮电大学 Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation
CN111967617A (en) * 2020-08-14 2020-11-20 北京深境智能科技有限公司 Machine learning method based on difficult sample learning and neural network fusion
CN112020724A (en) * 2019-04-01 2020-12-01 Google LLC Learning compressible features
CN112052945A (en) * 2019-06-06 2020-12-08 Beijing Horizon Robotics Technology R&D Co., Ltd. Neural network training method, neural network training device and electronic equipment
CN112150478A (en) * 2020-08-31 2020-12-29 Wenzhou Medical University Method and system for constructing semi-supervised image segmentation framework
CN112200062A (en) * 2020-09-30 2021-01-08 Guangzhou Yuncong Artificial Intelligence Technology Co., Ltd. Target detection method and device based on neural network, machine readable medium and equipment
CN112487182A (en) * 2019-09-12 2021-03-12 Huawei Technologies Co., Ltd. Training method of text processing model, and text processing method and device
CN112508169A (en) * 2020-11-13 2021-03-16 Huawei Technologies Co., Ltd. Knowledge distillation method and system
CN112529178A (en) * 2020-12-09 2021-03-19 National Space Science Center, Chinese Academy of Sciences Knowledge distillation method and system suitable for detection model without preselection frame
CN112560693A (en) * 2020-12-17 2021-03-26 Huazhong University of Science and Technology Highway foreign matter identification method and system based on deep learning target detection
CN112560631A (en) * 2020-12-09 2021-03-26 Kunming University of Science and Technology Knowledge distillation-based pedestrian re-identification method
CN112731852A (en) * 2021-01-26 2021-04-30 Nantong University Building energy consumption monitoring system based on edge computing and monitoring method thereof
CN113378866A (en) * 2021-08-16 2021-09-10 Shenzhen Aishen Yingtong Information Technology Co., Ltd. Image classification method, system, storage medium and electronic device
CN113470036A (en) * 2021-09-02 2021-10-01 Hunan University Hyperspectral image unsupervised band selection method and system based on knowledge distillation
CN113486185A (en) * 2021-09-07 2021-10-08 China Construction E-Commerce Co., Ltd. Knowledge distillation method based on joint training, processor and storage medium
CN113505719A (en) * 2021-07-21 2021-10-15 Shandong University of Science and Technology Gait recognition model compression system and method based on local-global joint knowledge distillation algorithm
CN113610146A (en) * 2021-08-03 2021-11-05 Jiangxi Xinborui Technology Co., Ltd. Method for realizing image classification based on knowledge distillation with enhanced intermediate-layer feature extraction
CN113947590A (en) * 2021-10-26 2022-01-18 Sichuan University Surface defect detection method based on multi-scale attention guidance and knowledge distillation
CN114998570A (en) * 2022-07-19 2022-09-02 Shanghai Shanma Intelligent Technology Co., Ltd. Method and device for determining object detection frame, storage medium and electronic device
CN116612378A (en) * 2023-05-22 2023-08-18 Henan University Detection method for unbalanced data and underwater small targets in complex backgrounds based on improved SSD
US12020425B2 (en) 2021-12-03 2024-06-25 Contemporary Amperex Technology Co., Limited Fast anomaly detection method and system based on contrastive representation distillation
CN118552739A (en) * 2024-07-30 2024-08-27 Shandong Institute of Aerospace Electronic Technology Image segmentation model compression method based on hardware perception

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090030897A1 (en) * 2007-07-26 2009-01-29 Hamid Hatami-Hanza Assisted Knowledge Discovery and Publication System and Method
US20140289323A1 (en) * 2011-10-14 2014-09-25 Cyber Ai Entertainment Inc. Knowledge-information-processing server system having image recognition system
CN106650756A (en) * 2016-12-28 2017-05-10 Guangdong Shunde Sun Yat-sen University-Carnegie Mellon University International Joint Research Institute Image text description method based on a knowledge-transfer multi-modal recurrent neural network
CN107247989A (en) * 2017-06-15 2017-10-13 Beijing TuSimple Future Technology Co., Ltd. Neural network training method and device
CN107358293A (en) * 2017-06-15 2017-11-17 Beijing TuSimple Future Technology Co., Ltd. Neural network training method and device
CN108010030A (en) * 2018-01-24 2018-05-08 Fuzhou University Real-time detection method for insulators in aerial images based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GUOBIN CHEN: "Learning Efficient Object Detection Models with Knowledge Distillation", NIPS '17: Proceedings of the 31st International Conference on Neural Information Processing Systems *
SHI ZEHAO: "Object Detection Algorithm Based on Feature Pyramid Networks", Modern Computer *

Cited By (68)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111179212A (en) * 2018-11-10 2020-05-19 Hangzhou Ningmou Intelligent Technology Co., Ltd. Method for realizing micro target detection chip integrating distillation strategy and deconvolution
CN111179212B (en) * 2018-11-10 2023-05-23 Hangzhou Ningmou Intelligent Technology Co., Ltd. Method for realizing tiny target detection on-chip by integrating distillation strategy and deconvolution
CN111178115B (en) * 2018-11-12 2024-01-12 Beijing Shenxing Technology Co., Ltd. Training method and system for object recognition network
CN111178115A (en) * 2018-11-12 2020-05-19 Beijing Shenxing Technology Co., Ltd. Training method and system of object recognition network
CN109635111A (en) * 2018-12-04 2019-04-16 Information and Communication Branch, State Grid Jiangxi Electric Power Co., Ltd. News clickbait detection method based on network transfer
CN109816636A (en) * 2018-12-28 2019-05-28 Shantou University Crack detection method based on intelligent terminal
CN109816636B (en) * 2018-12-28 2020-11-27 Shantou University Crack detection method based on intelligent terminal
WO2020143225A1 (en) * 2019-01-08 2020-07-16 Nanjing Institute of Advanced Artificial Intelligence Co., Ltd. Neural network training method and apparatus, and electronic device
CN109886343B (en) * 2019-02-26 2024-01-05 Shenzhen SenseTime Technology Co., Ltd. Image classification method and device, equipment and storage medium
CN109886343A (en) * 2019-02-26 2019-06-14 Shenzhen SenseTime Technology Co., Ltd. Image classification method and device, equipment, storage medium
CN109919110B (en) * 2019-03-13 2021-06-04 Beihang University Video attention area detection method, device and equipment
CN109919110A (en) * 2019-03-13 2019-06-21 Beihang University Video region-of-interest detection method, device and equipment
CN110059717A (en) * 2019-03-13 2019-07-26 Shandong University Automatic convolutional neural network segmentation method and system for mammography datasets
CN112020724A (en) * 2019-04-01 2020-12-01 Google LLC Learning compressible features
US12033077B2 (en) 2019-04-01 2024-07-09 Google LLC Learning compressible features
CN110298227B (en) * 2019-04-17 2021-03-30 Nanjing University of Aeronautics and Astronautics Vehicle detection method in unmanned aerial vehicle aerial image based on deep learning
CN110298227A (en) * 2019-04-17 2019-10-01 Nanjing University of Aeronautics and Astronautics Vehicle detection method in unmanned aerial vehicle aerial images based on deep learning
CN110120036A (en) * 2019-04-17 2019-08-13 Hangzhou Shuju Dianjin Technology Co., Ltd. Multi-scale tire X-ray defect detection method
CN112052945B (en) * 2019-06-06 2024-04-16 Beijing Horizon Robotics Technology R&D Co., Ltd. Neural network training method, neural network training device and electronic equipment
CN112052945A (en) * 2019-06-06 2020-12-08 Beijing Horizon Robotics Technology R&D Co., Ltd. Neural network training method, neural network training device and electronic equipment
CN110245754A (en) * 2019-06-14 2019-09-17 Xi'an University of Posts and Telecommunications Knowledge distillation method based on position-sensitive maps
CN110245754B (en) * 2019-06-14 2021-04-06 Xi'an University of Posts and Telecommunications Knowledge distillation guidance method based on position-sensitive maps
CN110263842A (en) * 2019-06-17 2019-09-20 Beijing Moviebook Technology Co., Ltd. Neural network training method, device, equipment and medium for target detection
CN110263842B (en) * 2019-06-17 2022-04-05 Beijing Moviebook Technology Co., Ltd. Neural network training method, apparatus, device, and medium for target detection
CN112487182B (en) * 2019-09-12 2024-04-12 Huawei Technologies Co., Ltd. Training method of text processing model, text processing method and device
CN112487182A (en) * 2019-09-12 2021-03-12 Huawei Technologies Co., Ltd. Training method of text processing model, and text processing method and device
CN111062951B (en) * 2019-12-11 2022-03-25 Huazhong University of Science and Technology Knowledge distillation method based on intra-class feature difference in semantic segmentation
CN111062951A (en) * 2019-12-11 2020-04-24 Huazhong University of Science and Technology Knowledge distillation method based on intra-class feature difference in semantic segmentation
CN111275646A (en) * 2020-01-20 2020-06-12 Nankai University Edge-preserving image smoothing method based on deep learning knowledge distillation technology
CN111260449B (en) * 2020-02-17 2023-04-07 Tencent Technology (Shenzhen) Co., Ltd. Model training method, commodity recommendation device and storage medium
CN111260449A (en) * 2020-02-17 2020-06-09 Tencent Technology (Shenzhen) Co., Ltd. Model training method, commodity recommendation device and storage medium
CN111312271A (en) * 2020-02-28 2020-06-19 Unisound Intelligent Technology Co., Ltd. Model compression method and system for improving convergence rate and processing performance
CN111428191B (en) * 2020-03-12 2023-06-16 Wuyi University Antenna downtilt angle calculation method and device based on knowledge distillation and storage medium
CN111428191A (en) * 2020-03-12 2020-07-17 Wuyi University Antenna downtilt angle calculation method and device based on knowledge distillation and storage medium
CN111461212A (en) * 2020-03-31 2020-07-28 Institute of Computing Technology, Chinese Academy of Sciences Compression method for point cloud target detection model
CN111461212B (en) * 2020-03-31 2023-04-07 Institute of Computing Technology, Chinese Academy of Sciences Compression method for point cloud target detection model
CN111476167B (en) * 2020-04-09 2024-03-22 Beijing Zhongke Qianxun Technology Co., Ltd. One-stage oriented remote sensing image target detection method based on Student-t distribution assistance
CN111476167A (en) * 2020-04-09 2020-07-31 Beijing Zhongke Qianxun Technology Co., Ltd. One-stage oriented remote sensing image target detection method based on Student-t distribution assistance
CN111626330A (en) * 2020-04-23 2020-09-04 Nanjing University of Posts and Telecommunications Target detection method and system based on multi-scale feature map reconstruction and knowledge distillation
CN111554268B (en) * 2020-07-13 2020-11-03 Tencent Technology (Shenzhen) Co., Ltd. Language identification method based on language model, text classification method and device
CN111554268A (en) * 2020-07-13 2020-08-18 Tencent Technology (Shenzhen) Co., Ltd. Language identification method based on language model, text classification method and device
CN111967617A (en) * 2020-08-14 2020-11-20 Beijing Shenjing Intelligent Technology Co., Ltd. Machine learning method based on hard sample learning and neural network fusion
CN111967617B (en) * 2020-08-14 2023-11-21 Beijing Shenjing Intelligent Technology Co., Ltd. Machine learning method based on hard sample learning and neural network fusion
CN112150478A (en) * 2020-08-31 2020-12-29 Wenzhou Medical University Method and system for constructing semi-supervised image segmentation framework
CN112200062B (en) * 2020-09-30 2021-09-28 Guangzhou Yuncong Artificial Intelligence Technology Co., Ltd. Target detection method and device based on neural network, machine readable medium and equipment
CN112200062A (en) * 2020-09-30 2021-01-08 Guangzhou Yuncong Artificial Intelligence Technology Co., Ltd. Target detection method and device based on neural network, machine readable medium and equipment
CN112508169A (en) * 2020-11-13 2021-03-16 Huawei Technologies Co., Ltd. Knowledge distillation method and system
CN112529178B (en) * 2020-12-09 2024-04-09 National Space Science Center, Chinese Academy of Sciences Knowledge distillation method and system suitable for detection model without preselection frame
CN112529178A (en) * 2020-12-09 2021-03-19 National Space Science Center, Chinese Academy of Sciences Knowledge distillation method and system suitable for detection model without preselection frame
CN112560631A (en) * 2020-12-09 2021-03-26 Kunming University of Science and Technology Knowledge distillation-based pedestrian re-identification method
CN112560693B (en) * 2020-12-17 2022-06-17 Huazhong University of Science and Technology Highway foreign matter identification method and system based on deep learning target detection
CN112560693A (en) * 2020-12-17 2021-03-26 Huazhong University of Science and Technology Highway foreign matter identification method and system based on deep learning target detection
CN112731852B (en) * 2021-01-26 2022-03-22 Nantong University Building energy consumption monitoring system based on edge computing and monitoring method thereof
CN112731852A (en) * 2021-01-26 2021-04-30 Nantong University Building energy consumption monitoring system based on edge computing and monitoring method thereof
CN113505719B (en) * 2021-07-21 2023-11-24 Shandong University of Science and Technology Gait recognition model compression system and method based on local-global joint knowledge distillation algorithm
CN113505719A (en) * 2021-07-21 2021-10-15 Shandong University of Science and Technology Gait recognition model compression system and method based on local-global joint knowledge distillation algorithm
CN113610146B (en) * 2021-08-03 2023-08-04 Jiangxi Xinborui Technology Co., Ltd. Method for realizing image classification based on knowledge distillation with enhanced intermediate-layer feature extraction
CN113610146A (en) * 2021-08-03 2021-11-05 Jiangxi Xinborui Technology Co., Ltd. Method for realizing image classification based on knowledge distillation with enhanced intermediate-layer feature extraction
CN113378866A (en) * 2021-08-16 2021-09-10 Shenzhen Aishen Yingtong Information Technology Co., Ltd. Image classification method, system, storage medium and electronic device
CN113470036A (en) * 2021-09-02 2021-10-01 Hunan University Hyperspectral image unsupervised band selection method and system based on knowledge distillation
CN113470036B (en) * 2021-09-02 2021-11-23 Hunan University Hyperspectral image unsupervised band selection method and system based on knowledge distillation
CN113486185A (en) * 2021-09-07 2021-10-08 China Construction E-Commerce Co., Ltd. Knowledge distillation method based on joint training, processor and storage medium
CN113947590A (en) * 2021-10-26 2022-01-18 Sichuan University Surface defect detection method based on multi-scale attention guidance and knowledge distillation
US12020425B2 (en) 2021-12-03 2024-06-25 Contemporary Amperex Technology Co., Limited Fast anomaly detection method and system based on contrastive representation distillation
CN114998570A (en) * 2022-07-19 2022-09-02 Shanghai Shanma Intelligent Technology Co., Ltd. Method and device for determining object detection frame, storage medium and electronic device
CN116612378A (en) * 2023-05-22 2023-08-18 Henan University Detection method for unbalanced data and underwater small targets in complex backgrounds based on improved SSD
CN116612378B (en) * 2023-05-22 2024-07-05 Henan University Detection method for unbalanced data and underwater small targets in complex backgrounds based on improved SSD
CN118552739A (en) * 2024-07-30 2024-08-27 Shandong Institute of Aerospace Electronic Technology Image segmentation model compression method based on hardware perception

Similar Documents

Publication Publication Date Title
CN108764462A (en) Convolutional neural network optimization method based on knowledge distillation
CN111476294B (en) Zero sample image identification method and system based on generation countermeasure network
CN108564029B (en) Face attribute recognition method based on cascade multitask learning deep neural network
CN108805200B (en) Optical remote sensing scene classification method and device based on depth twin residual error network
CN111488474B (en) Fine-grained freehand sketch image retrieval method based on attention enhancement
Fang et al. DART: Domain-adversarial residual-transfer networks for unsupervised cross-domain image classification
CN112651406B (en) Depth perception and multi-mode automatic fusion RGB-D significance target detection method
CN110633708A (en) Deep network significance detection method based on global model and local optimization
CN109241982A (en) Object detection method based on deep convolutional neural networks
CN106570522B (en) Object recognition model establishing method and object recognition method
CN110321859A (en) Optical remote sensing scene classification method based on deep twin capsule networks
CN107292352A (en) Image classification method and device based on convolutional neural networks
CN112365514A (en) Semantic segmentation method based on improved PSPNet
Li et al. A review of deep learning methods for pixel-level crack detection
CN112801138B (en) Multi-person gesture estimation method based on human body topological structure alignment
Wang et al. Hyperspectral image classification via deep network with attention mechanism and multigroup strategy
CN116012722A (en) Remote sensing image scene classification method
CN115223017B (en) Multi-scale feature fusion bridge detection method based on depth separable convolution
Hu et al. Supervised multi-scale attention-guided ship detection in optical remote sensing images
CN111815680A (en) Deep convolutional neural network automatic horizon tracking method based on constant fast mapping
CN115527098A (en) Infrared small target detection method based on global mean contrast space attention
CN118230175A (en) Real estate mapping data processing method and system based on artificial intelligence
Wang et al. Underground defects detection based on GPR by fusing simple linear iterative clustering phash (SLIC-phash) and convolutional block attention module (CBAM)-YOLOv8
CN117671364A (en) Model processing method and device for image recognition, electronic equipment and storage medium
Zhang et al. A small target detection algorithm based on improved YOLOv5 in aerial image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20220701