CN108764462A - A convolutional neural network optimization method based on knowledge distillation - Google Patents
A convolutional neural network optimization method based on knowledge distillation
- Publication number
- CN108764462A (application CN201810530304.7A / CN201810530304A)
- Authority
- CN
- China
- Prior art keywords
- fpn
- networks
- network
- convolutional neural
- loss
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a convolutional neural network optimization method based on knowledge distillation: positions are chosen within the additional structure of the feature-pyramid part of an FPN to establish bridges; multiple feature adaptation layers are built at the bridging positions between a teacher FPN network T and a student FPN network S; and a hierarchically weighted multi-scale loss function is used as the loss function for network training. The positive effects of the invention are twofold. On the one hand, based on the knowledge distillation design of the invention, a complex teacher FPN network can be compressed to obtain a smaller, faster student FPN network; compared with existing CNN-based object detection techniques that use FPN directly, this makes edge-side deployment more convenient. On the other hand, in terms of how the distillation is performed, the invention adapts better to the multi-scale object detection network FPN than existing knowledge distillation techniques and can therefore train a high-quality student FPN network.
Description
Technical field
The present invention relates to a convolutional neural network optimization method based on knowledge distillation.
Background technology
At present, deep learning, with convolutional neural network (CNN) based techniques as its representative, is widely used in many traditional computer vision tasks such as image classification, object detection, and image segmentation, owing to its powerful representation capability: the features it extracts are more robust than the features constructed by hand in conventional methods. In image classification, the typical practice is to train the CNN model with cross-entropy as the loss function.
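As a hedged illustration of that typical practice, the softmax cross-entropy loss can be sketched in NumPy; the function names and sample values below are ours, not the patent's:

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true class over the batch.
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 3.0, 0.3]])
labels = np.array([0, 1])
loss = cross_entropy(logits, labels)  # small, since both samples are classified correctly
```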
In recent years, the development of deep learning has shown three main trends: model structures are increasingly complex, model depth keeps growing, and large-scale datasets continue to expand. However, as the demand for edge computing with CNNs on mobile and embedded platforms keeps rising, the resource constraints of edge-side computing platforms require CNN models to be as small and computationally efficient as possible. To this end, academia and industry have proposed various model compression methods in recent years, such as model pruning, low-rank decomposition, and low-precision quantization of model parameters. In the 2014 paper Distilling the Knowledge in a Neural Network, Hinton et al. proposed a knowledge distillation method: a large CNN trained on a large-scale dataset serves as the teacher network, and a small CNN serves as the student network; the student is trained jointly on the probability distribution vectors output by the teacher network and the human annotations of the training set. They demonstrated that this method overcomes the difficulty of training a small CNN on a large dataset; after training, the student can obtain classification results close to, or even above, those of the teacher network. The method can be viewed as a means of knowledge transfer: knowledge is transferred from the teacher network to the student network during training. The design goal is that, after the transfer, the large and cumbersome teacher network is replaced for the task by a small, fast, agile student network, which greatly facilitates the deployment of deep learning on edge-side platforms.
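The Hinton-style distillation described above can be sketched as follows; this is a hedged NumPy illustration under our own assumptions, and the temperature T, mixing weight alpha, and all names are illustrative rather than taken from the patent or the cited paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-softened softmax; higher T flattens the distribution.
    z = (logits / T) - (logits / T).max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft term: cross-entropy between the teacher's and student's softened distributions.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -(p_teacher * np.log(p_student + 1e-12)).sum(axis=1).mean()
    # Hard term: ordinary cross-entropy on the ground-truth labels.
    p = softmax(student_logits)
    hard = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * soft + (1 - alpha) * hard

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 10)) * 3.0             # stand-in teacher logits
student = teacher + rng.normal(size=(8, 10)) * 0.1   # student close to the teacher
labels = teacher.argmax(axis=1)
loss = distillation_loss(student, teacher, labels)
```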
After Hinton et al. proposed the knowledge distillation theory, Romero et al. proposed a new method: matching features at the intermediate layers of the teacher and student networks to provide a reliable supervisory signal (hint). This effectively guides the training of student networks with deep structures, overcoming a weakness of Hinton's method and making it possible to distill the knowledge of deep networks.
Existing knowledge distillation techniques currently target mainly CNNs for classification. Meanwhile, applications of deep learning technology, with face recognition as a representative example, are flourishing, and new scenarios keep emerging; this challenges the robustness of face detection algorithms. For example, because of differences in the distance between a face and the camera, different people, or the same person at different moments, appear in the image at widely different scales. Similarly, other kinds of object detection also face this problem in particular scenarios. Even though many object detection algorithms are based on CNNs, their performance on targets of small scale remains unsatisfactory. Therefore, Lin et al. proposed a multi-scale object detection network called the Feature Pyramid Network (FPN). Its backbone (backbone) is typically a large deep model such as ResNet, and it fuses multi-scale features to detect targets; it achieves very good detection performance and shows advantages over traditional algorithms in areas such as small-target detection. However, because it uses a large backbone network like ResNet, deploying FPN on an edge-side platform is subject to many limitations in computation, storage, and so on. The present invention performs knowledge distillation on FPN, reducing its network size and computational load while maintaining its object detection performance, thereby facilitating its deployment on edge-side platforms.
The background of FPN:
Fig. 1 shows the situation of a typical object detection network such as Faster R-CNN: the input image enters the backbone network (e.g., the first layer of a ResNet), and the data is propagated forward all the way to the last layer, where the detection-related predictions are made. This is therefore a single-scale structure, containing no multi-scale pyramid design.
Fig. 2 is the skeleton diagram of FPN. Besides the usual bottom-up propagation through the backbone, from the first layer to the last, there are also top-down connections and lateral connections, which generate an additional feature pyramid (right half of the sketch).
Fig. 3 details the lateral connections and top-down connections between the feature pyramid levels: outside the backbone network, FPN adds extra 1x1 convolutional layers, upsampling, and element-wise (Eltwise) addition layers.
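Under simplifying assumptions (nearest-neighbor upsampling, random weights, channel-first arrays), the lateral-connection step of Fig. 3 can be sketched as follows; this illustrates the structure only and is not the patent's implementation:

```python
import numpy as np

def conv1x1(x, w):
    # A 1x1 convolution is a per-pixel linear map over channels.
    # x: (C_in, H, W), w: (C_out, C_in)  ->  (C_out, H, W)
    return np.einsum('oc,chw->ohw', w, x)

def upsample2x(x):
    # Nearest-neighbor 2x upsampling of a (C, H, W) feature map.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_merge(c_lateral, p_above, w_lateral):
    # Lateral 1x1 conv on the backbone feature, then element-wise (Eltwise)
    # addition with the upsampled pyramid feature from the level above.
    return conv1x1(c_lateral, w_lateral) + upsample2x(p_above)

rng = np.random.default_rng(1)
C4 = rng.normal(size=(512, 14, 14))   # backbone feature at one level
P5 = rng.normal(size=(256, 7, 7))     # pyramid feature one level up
w = rng.normal(size=(256, 512)) * 0.01
P4 = fpn_merge(C4, P5, w)             # merged pyramid feature, shape (256, 14, 14)
```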
Since FPN makes the detection-related predictions at every level of the feature pyramid, it enjoys an advantage in multi-scale object detection that ordinary object detection networks lack.
For ordinary object detection networks, Chen et al. proposed a knowledge distillation method in the paper Learning Efficient Object Detection Models with Knowledge Distillation. Taking into account the differences between the object detection task and the basic classification task, they purposely designed the loss function required for training the network, as well as a feature adaptation layer attached to the backbone between the teacher network and the student network during training.
However, FPN, as a special multi-scale object detection network, is not suited to applying that paper's method directly for knowledge distillation. For example, the feature adaptation layer designed in Chen's paper bridges the teacher and student networks directly on the backbone and adapts their intermediate features there. Given that design, it is appropriate for networks whose detection features are indeed produced on the backbone. FPN, however, does not directly use the features output by the backbone network; instead, it adds additional structure outside the backbone, and it is exactly the features output by this additional structure that are applied to the detection task. Using the above method unchanged for FPN is therefore unsuitable.
As another example, that paper's method has only one feature adaptation layer, which suffices because it is based on a single-scale task. FPN is different: it uses multiple feature outputs for a multi-scale task, so with only one feature adaptation layer, a good match of the multi-scale features between the teacher and student networks cannot be guaranteed.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention provides a convolutional neural network optimization method based on knowledge distillation: a multi-scale knowledge distillation method designed to be better suited to the multi-scale object detection network FPN.
The technical solution adopted by the present invention is: a convolutional neural network optimization method based on knowledge distillation, in which positions are chosen within the additional structure of the feature-pyramid part of FPN to establish bridges; multiple feature adaptation layers are built at the bridging positions between the teacher FPN network T and the student FPN network S; and a hierarchically weighted multi-scale loss function is used as the loss function for network training.
Compared with the prior art, the positive effects of the present invention are:
On the one hand, based on the knowledge distillation design of the invention, in which the student FPN network is trained under the guidance of the teacher FPN network, a complex teacher FPN network can be compressed to obtain a smaller, faster student FPN network. Compared with existing CNN-based object detection techniques that use FPN directly, this makes edge-side deployment more convenient.
On the other hand, in terms of how the distillation is performed, the invention adapts better to the multi-scale object detection network FPN than existing knowledge distillation techniques and can train a higher-quality student FPN network.
Description of the drawings
Embodiments of the present invention are described below with reference to the accompanying drawings, wherein:
Fig. 1 shows the situation of a typical object detection network;
Fig. 2 is the skeleton diagram of FPN;
Fig. 3 shows the lateral connections and top-down connections between the feature pyramid levels.
Detailed description of the embodiments
The method of the present invention comprises the following.
First, the present invention places the feature adaptation layers on the additional structure outside the backbone network, unlike Chen's paper, which performs the teacher-student feature adaptation on the backbone itself.
Second, what the present invention designs on the additional structure are multiple (selectable) feature adaptation layers, so that pairwise matching of multi-layer features, adapted to the multi-scale feature pyramid, can be performed between the teacher and student networks.
Third, in the design of the loss function, the present invention takes account of FPN's multi-scale detection function and designs a multi-scale loss function with multi-level loss weighting, again differing from Chen's paper.
Specifically, the discussion is divided into two main parts: structure and loss function.
First, regarding structure: the present invention adopts a feature adaptation approach similar to that of Chen's paper, i.e., feature adaptation is performed with 1x1 convolutional layers at the bridging positions between the teacher and student networks. The main role of the 1x1 convolution is to adapt the channel count of the teacher network's intermediate feature map (feature map), serving as the input, to the channel count of the student network's intermediate feature map, serving as the output.
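A minimal sketch of such an adaptation layer, assuming a 1x1 convolution that maps the teacher's channel count (here 256, illustrative) to the student's (here 128, illustrative); names and sizes are ours:

```python
import numpy as np

def adaptation_layer(teacher_feat, w):
    # teacher_feat: (Ct, H, W); w: (Cs, Ct) -> adapted feature of shape (Cs, H, W),
    # channel-matched to the student's feature map at the same pyramid level.
    return np.einsum('sc,chw->shw', w, teacher_feat)

rng = np.random.default_rng(2)
P3_t = rng.normal(size=(256, 28, 28))    # teacher pyramid feature
w_adapt = rng.normal(size=(128, 256)) * 0.01
Z3 = adaptation_layer(P3_t, w_adapt)     # now comparable with a (128, 28, 28) student P3_s
```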
Unlike the essence of Chen's paper, however, the present invention does not choose bridging positions on the backbone network, but chooses them within the additional structure of the feature-pyramid part of FPN.
Suppose the additional structure of the teacher FPN network T outputs Nt feature maps and the additional structure of the student FPN network S outputs Ns feature maps. The present invention then establishes n feature adaptation layers between T and S, where 1 ≤ n ≤ N and N = min(Nt, Ns). In other words, if the student network's additional structure outputs fewer feature maps, the number of adaptation layers established by the invention ranges over [1, Ns]; conversely, if the teacher network's additional structure outputs fewer feature maps, the range is [1, Nt].
For example, if Nt and Ns are both 4, the number of adaptation layers ranges from 1 to 4.
Take the original FPN with ResNet as the backbone network as an example, and assume T and S have identical structures. Let {C2, C3, C4, C5} denote the output feature maps of the conv2, conv3, conv4, and conv5 layers of ResNet, and let {P2, P3, P4, P5} in the additional structure of FPN denote the output feature maps of the same sizes as {C2, C3, C4, C5} on the backbone, respectively. Distinguishing {P2, P3, P4, P5} on T and on S with the suffixes _t and _s respectively, this embodiment selects the positions of the adaptation layers based on the feature maps {P2_t, P3_t, P4_t, P5_t} and {P2_s, P3_s, P4_s, P5_s}.
For example, if n = 1, the adaptation layer position selected in this embodiment is any one element of the set {(P2_t, P2_s), (P3_t, P3_s), (P4_t, P4_s), (P5_t, P5_s)}. Taking (P2_t, P2_s) as an example, this means that P2_t of T is used as the input of the adaptation layer, its output after the 1x1 convolution is fed into S, and feature matching is performed against P2_s of S; the other positions work analogously. Preferably, the adaptation layer position is (P3_t, P3_s) or (P4_t, P4_s); that is, among the available positions, a relatively central one is selected for bridging.
If n = 4, the positions of the four feature adaptation layers are {(P2_t, P2_s), (P3_t, P3_s), (P4_t, P4_s), (P5_t, P5_s)}.
If n takes an intermediate value, e.g. n = 2, it is preferable to select one bridging position from {(P2_t, P2_s), (P3_t, P3_s)} and another from {(P4_t, P4_s), (P5_t, P5_s)}.
If multiple bridges are established between T and S, then the positions of any two bridges (Pi_t, Pk_s) and (Pj_t, Pl_s) must satisfy the constraints: (1) i ≠ j and k ≠ l; (2) if i > j, then k > l. This ensures that no two bridges coincide in position and that no two bridges between T and S cross in the feature hierarchy.
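These two constraints can be checked mechanically. A sketch (our own helper, with bridges given as (i, k) index pairs meaning Pi_t bridged to Pk_s):

```python
from itertools import combinations

def bridges_valid(bridges):
    # bridges: list of (i, k) pairs, teacher level Pi_t bridged to student level Pk_s.
    for (i, k), (j, l) in combinations(bridges, 2):
        # (1) no two bridges may share a teacher level or a student level
        if i == j or k == l:
            return False
        # (2) bridges must not cross: teacher-side order must match student-side order
        if (i > j) != (k > l):
            return False
    return True

print(bridges_valid([(2, 2), (4, 4)]))   # non-overlapping and non-crossing: valid
print(bridges_valid([(2, 3), (3, 2)]))   # crossing: invalid
```

The second example fails constraint (2): going from teacher level 2 to 3 while going from student level 3 down to 2 would make the bridges intersect in the feature hierarchy.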
Second, the design of the loss function for network training. In Chen's method, the loss terms L_RPN of the RPN (region proposal network) part and L_RCN of the RCN (region classification and box regression) part of the object detection network are defined such that λ is a hyperparameter, N is the batch size of the RCN, M is the batch size of the RPN, the classification loss L_cls is a combination of the hard softmax loss based on ground-truth labels and the soft loss based on knowledge distillation, and the box regression loss L_reg is a combination of the smooth L1 loss and the teacher-bounded L2 loss.
In the definitions of the loss terms L_RPN and L_RCN, the present invention follows Chen's paper. The total loss function L of that paper is:
L = L_RPN + L_RCN + γ·L_hint
where γ is a hyperparameter and L_hint is the feature comparison loss between the teacher and student networks after feature adaptation. L is thus the sum of the RPN loss, the RCN loss, and the feature comparison loss.
Based on a similar idea, and further optimized for the multi-scale character of the features, the present invention defines the loss function:
L = L_RPN + L_RCN + Σ_{i=1..n} γ_i·L_hint_i
where Σ_{i=1..n} γ_i·L_hint_i is the weighted sum of the feature comparison losses after the n feature adaptation layers, and γ_i is a hyperparameter: the weight of the feature comparison loss L_hint_i after the i-th feature adaptation layer.
Here Z_i denotes the teacher network's intermediate feature, taken as the input of the current feature adaptation layer and passed through the feature adaptation, and V_i denotes the feature at the output end of the current feature adaptation layer (i.e., the corresponding intermediate feature of the student network).
In the present invention, the γ_i control two kinds of balance: first, the trade-off in importance between the post-adaptation feature comparison losses and the other loss types; second, the trade-off inside the feature comparison loss, i.e., the relative importance of the comparison losses at the different scales after the feature adaptation layers.
During network training, the γ_i can be adjusted flexibly so that the loss function better suits the specific detection task. Preferably, let γ_2 and γ_3 be the post-adaptation feature comparison loss weights at positions (P2_t, P2_s) and (P3_t, P3_s) respectively. For instance, when the specific task contains relatively many small-scale targets to be detected, one can set γ_2 > γ_3 to strengthen the optimization of small-target detection; conversely, when large-scale targets predominate, one can set γ_2 < γ_3, and so on.
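Assuming the per-layer feature comparison loss is a mean-squared (L2) distance between the adapted teacher features Z_i and the student features V_i, a common choice in hint-based distillation, the hierarchically weighted total loss can be sketched as follows (all names and values are illustrative):

```python
import numpy as np

def hint_loss(Z, V):
    # Mean squared (L2) distance between adapted teacher and student features.
    return np.mean((Z - V) ** 2)

def total_loss(l_rpn, l_rcn, Z_list, V_list, gammas):
    # L = L_RPN + L_RCN + sum_i gamma_i * L_hint_i
    hint = sum(g * hint_loss(Z, V) for g, Z, V in zip(gammas, Z_list, V_list))
    return l_rpn + l_rcn + hint

rng = np.random.default_rng(3)
Z_list = [rng.normal(size=(128, s, s)) for s in (56, 28)]       # adapted teacher P2, P3
V_list = [Z + rng.normal(size=Z.shape) * 0.1 for Z in Z_list]   # student features, close to teacher
# Weight P2 (the finer scale) more when small targets dominate: gamma_2 > gamma_3.
L = total_loss(l_rpn=0.8, l_rcn=1.2, Z_list=Z_list, V_list=V_list, gammas=[0.6, 0.4])
```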
As described in the original FPN paper, FPN as a whole is a generic architecture; accordingly, the backbone part of the FPN structures involved in the present invention is likewise not limited to the ResNet of the embodiment and can be any other deep convolutional neural network.
The main differences between the present invention and the prior art are summarized as follows:
1. The position of the feature adaptation layers (additional structure vs. backbone network): being closer than the backbone to the features actually used by the task, the teacher network can provide more effective supervisory signals (hints) to the student network;
2. The number of feature adaptation layers (single-scale vs. multi-scale adaptation, i.e., n layers within the given value range): this better fits the feature pyramid that is the essence of FPN;
3. The design of the loss function (single-scale loss vs. hierarchically weighted multi-scale loss): this gives better control both over the balance among the per-scale loss terms inside the multi-scale feature comparison loss and over the balance between them and the other loss types.
Claims (8)
1. a kind of convolutional neural networks optimization method of knowledge based distillation, it is characterised in that:From the feature pyramid part of FPN
Additional structure in chosen position establish bridge joint;Bridge joint position between teacher FPN networks T and student's FPN networks S is established more
A feature adaptation layer;It is used as the loss function of network training using the multiple dimensioned loss function of level weighting.
2. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 1, it is characterised in that:?
On bridge joint position between teacher FPN networks T and student's FPN networks S feature adaptation is carried out with 1x1 convolutional layers.
3. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 1, it is characterised in that:If
Establish multiple bridge joints between teacher FPN networks T and student's FPN networks S, then any two of which bridge joint (Pi_t, Pk_s) and
The position relationship of (Pj_t, Pl_s) must meet following constraint:(1) i is not equal to j, and k is not equal to 1;(2) if i>J, then k>
1。
4. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 1, it is characterised in that:It is special
The number n of sign adaptation layer is determined in the following way:1≤n≤N and N=min (Nt, Ns), wherein:Nt indicates teacher's FPN networks
T additional structures export characteristic pattern quantity, and Ns indicates that student's FPN network S additional structures export characteristic pattern quantity.
5. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 1, it is characterised in that:Institute
Stating the multiple dimensioned loss function that level weights is:
Wherein:LRPNAnd LRCNIt is illustrated respectively in the loss item of the RPN part and the parts RCN in target detection network;For
Aspect ratio is to the weighted sum of loss, γ after n feature adaptation layeriFor hyper parameter, correspond to aspect ratio after each feature adaptation layer
To lossWeight.
6. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 5, it is characterised in that:Institute
Aspect ratio is to loss after stating each feature adaptation layerIt is calculated as follows:
Wherein, ZiFor the input terminal of current signature adaptation layer, ViFor the output end of current signature adaptation layer.
7. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 5, it is characterised in that:
LRPNAnd LRCNFollowing formula is respectively adopted to calculate:
Wherein:λ is hyper parameter, and N is the batch sizes of RCN, and M is the batch sizes of RPN, Classification Loss LclsIt is to be based on
The combination of the soft loss of the softmax of ground truth labels losses firmly and knowledge based distillation, frame return loss Lreg
It is the combination of smooth L1 losses and the L2 losses of teacher's network limit.
8. a kind of convolutional neural networks optimization method of knowledge based distillation according to claim 5, it is characterised in that:When
When needing the small scaled target detected relatively more in specific tasks, then γ is seti>γi+1If otherwise large scale target is more
Then set γi<γi+1。
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810530304.7A CN108764462A (en) | 2018-05-29 | 2018-05-29 | A kind of convolutional neural networks optimization method of knowledge based distillation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108764462A true CN108764462A (en) | 2018-11-06 |
Family
ID=64003296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810530304.7A Pending CN108764462A (en) | 2018-05-29 | 2018-05-29 | A kind of convolutional neural networks optimization method of knowledge based distillation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108764462A (en) |
Cited By (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109635111A (en) * | 2018-12-04 | 2019-04-16 | 国网江西省电力有限公司信息通信分公司 | A kind of news click bait detection method based on network migration |
CN109816636A (en) * | 2018-12-28 | 2019-05-28 | 汕头大学 | A kind of crack detection method based on intelligent terminal |
CN109886343A (en) * | 2019-02-26 | 2019-06-14 | 深圳市商汤科技有限公司 | Image classification method and device, equipment, storage medium |
CN109919110A (en) * | 2019-03-13 | 2019-06-21 | 北京航空航天大学 | Video area-of-interest-detection method, device and equipment |
CN110059717A (en) * | 2019-03-13 | 2019-07-26 | 山东大学 | Convolutional neural networks automatic division method and system for breast molybdenum target data set |
CN110120036A (en) * | 2019-04-17 | 2019-08-13 | 杭州数据点金科技有限公司 | A kind of multiple dimensioned tire X-ray defect detection method |
CN110245754A (en) * | 2019-06-14 | 2019-09-17 | 西安邮电大学 | A kind of knowledge distillating method based on position sensing figure |
CN110263842A (en) * | 2019-06-17 | 2019-09-20 | 北京影谱科技股份有限公司 | For the neural network training method of target detection, device, equipment, medium |
CN110298227A (en) * | 2019-04-17 | 2019-10-01 | 南京航空航天大学 | A kind of vehicle checking method in unmanned plane image based on deep learning |
CN111062951A (en) * | 2019-12-11 | 2020-04-24 | 华中科技大学 | Knowledge distillation method based on semantic segmentation intra-class feature difference |
CN111179212A (en) * | 2018-11-10 | 2020-05-19 | 杭州凝眸智能科技有限公司 | Method for realizing micro target detection chip integrating distillation strategy and deconvolution |
CN111178115A (en) * | 2018-11-12 | 2020-05-19 | 北京深醒科技有限公司 | Training method and system of object recognition network |
CN111260449A (en) * | 2020-02-17 | 2020-06-09 | 腾讯科技(深圳)有限公司 | Model training method, commodity recommendation device and storage medium |
CN111275646A (en) * | 2020-01-20 | 2020-06-12 | 南开大学 | Edge-preserving image smoothing method based on deep learning knowledge distillation technology |
CN111312271A (en) * | 2020-02-28 | 2020-06-19 | 云知声智能科技股份有限公司 | Model compression method and system for improving convergence rate and processing performance |
WO2020143225A1 (en) * | 2019-01-08 | 2020-07-16 | 南京人工智能高等研究院有限公司 | Neural network training method and apparatus, and electronic device |
CN111428191A (en) * | 2020-03-12 | 2020-07-17 | 五邑大学 | Antenna downward inclination angle calculation method and device based on knowledge distillation and storage medium |
CN111461212A (en) * | 2020-03-31 | 2020-07-28 | 中国科学院计算技术研究所 | Compression method for point cloud target detection model |
CN111476167A (en) * | 2020-04-09 | 2020-07-31 | 北京中科千寻科技有限公司 | student-T distribution assistance-based one-stage direction remote sensing image target detection method |
CN111554268A (en) * | 2020-07-13 | 2020-08-18 | 腾讯科技(深圳)有限公司 | Language identification method based on language model, text classification method and device |
CN111626330A (en) * | 2020-04-23 | 2020-09-04 | 南京邮电大学 | Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation |
CN111967617A (en) * | 2020-08-14 | 2020-11-20 | 北京深境智能科技有限公司 | Machine learning method based on difficult sample learning and neural network fusion |
CN112020724A (en) * | 2019-04-01 | 2020-12-01 | 谷歌有限责任公司 | Learning compressible features |
CN112052945A (en) * | 2019-06-06 | 2020-12-08 | 北京地平线机器人技术研发有限公司 | Neural network training method, neural network training device and electronic equipment |
CN112150478A (en) * | 2020-08-31 | 2020-12-29 | 温州医科大学 | Method and system for constructing semi-supervised image segmentation framework |
CN112200062A (en) * | 2020-09-30 | 2021-01-08 | 广州云从人工智能技术有限公司 | Target detection method and device based on neural network, machine readable medium and equipment |
CN112487182A (en) * | 2019-09-12 | 2021-03-12 | 华为技术有限公司 | Training method of text processing model, and text processing method and device |
CN112508169A (en) * | 2020-11-13 | 2021-03-16 | 华为技术有限公司 | Knowledge distillation method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090030897A1 (en) * | 2007-07-26 | 2009-01-29 | Hamid Hatami-Hanza | Assissted Knowledge Discovery and Publication System and Method |
US20140289323A1 (en) * | 2011-10-14 | 2014-09-25 | Cyber Ai Entertainment Inc. | Knowledge-information-processing server system having image recognition system |
CN106650756A (en) * | 2016-12-28 | 2017-05-10 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Image text description method based on knowledge transfer multi-modal recurrent neural network |
CN107247989A (en) * | 2017-06-15 | 2017-10-13 | 北京图森未来科技有限公司 | A kind of neural network training method and device |
CN107358293A (en) * | 2017-06-15 | 2017-11-17 | 北京图森未来科技有限公司 | A kind of neural network training method and device |
CN108010030A (en) * | 2018-01-24 | 2018-05-08 | 福州大学 | A kind of Aerial Images insulator real-time detection method based on deep learning |
- 2018-05-29: Application CN201810530304.7A filed in China; published as CN108764462A (status: Pending)
Non-Patent Citations (2)
Title |
---|
GUOBIN CHEN: "Learning Efficient Object Detection Models with Knowledge Distillation", NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems * |
SHI ZEHAO: "Object Detection Algorithm Based on Feature Pyramid Networks" (in Chinese), Modern Computer * |
Cited By (68)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111179212A (en) * | 2018-11-10 | 2020-05-19 | 杭州凝眸智能科技有限公司 | Method for realizing micro target detection chip integrating distillation strategy and deconvolution |
CN111179212B (en) * | 2018-11-10 | 2023-05-23 | 杭州凝眸智能科技有限公司 | Method for realizing tiny target detection on-chip by integrating distillation strategy and deconvolution |
CN111178115B (en) * | 2018-11-12 | 2024-01-12 | 北京深醒科技有限公司 | Training method and system for object recognition network |
CN111178115A (en) * | 2018-11-12 | 2020-05-19 | 北京深醒科技有限公司 | Training method and system of object recognition network |
CN109635111A (en) * | 2018-12-04 | 2019-04-16 | 国网江西省电力有限公司信息通信分公司 | A kind of news click bait detection method based on network migration |
CN109816636A (en) * | 2018-12-28 | 2019-05-28 | 汕头大学 | A kind of crack detection method based on intelligent terminal |
CN109816636B (en) * | 2018-12-28 | 2020-11-27 | 汕头大学 | Crack detection method based on intelligent terminal |
WO2020143225A1 (en) * | 2019-01-08 | 2020-07-16 | 南京人工智能高等研究院有限公司 | Neural network training method and apparatus, and electronic device |
CN109886343B (en) * | 2019-02-26 | 2024-01-05 | 深圳市商汤科技有限公司 | Image classification method and device, equipment and storage medium |
CN109886343A (en) * | 2019-02-26 | 2019-06-14 | 深圳市商汤科技有限公司 | Image classification method and device, equipment, storage medium |
CN109919110B (en) * | 2019-03-13 | 2021-06-04 | 北京航空航天大学 | Video attention area detection method, device and equipment |
CN109919110A (en) * | 2019-03-13 | 2019-06-21 | 北京航空航天大学 | Video area-of-interest-detection method, device and equipment |
CN110059717A (en) * | 2019-03-13 | 2019-07-26 | 山东大学 | Convolutional neural networks automatic division method and system for breast molybdenum target data set |
CN112020724A (en) * | 2019-04-01 | 2020-12-01 | 谷歌有限责任公司 | Learning compressible features |
US12033077B2 (en) | 2019-04-01 | 2024-07-09 | Google Llc | Learning compressible features |
CN110298227B (en) * | 2019-04-17 | 2021-03-30 | 南京航空航天大学 | Vehicle detection method in unmanned aerial vehicle aerial image based on deep learning |
CN110298227A (en) * | 2019-04-17 | 2019-10-01 | 南京航空航天大学 | A kind of vehicle checking method in unmanned plane image based on deep learning |
CN110120036A (en) * | 2019-04-17 | 2019-08-13 | 杭州数据点金科技有限公司 | A kind of multiple dimensioned tire X-ray defect detection method |
CN112052945B (en) * | 2019-06-06 | 2024-04-16 | 北京地平线机器人技术研发有限公司 | Neural network training method, neural network training device and electronic equipment |
CN112052945A (en) * | 2019-06-06 | 2020-12-08 | 北京地平线机器人技术研发有限公司 | Neural network training method, neural network training device and electronic equipment |
CN110245754A (en) * | 2019-06-14 | 2019-09-17 | 西安邮电大学 | Knowledge distillation method based on position-sensitive maps |
CN110245754B (en) * | 2019-06-14 | 2021-04-06 | 西安邮电大学 | Knowledge distillation guiding method based on position-sensitive maps |
CN110263842A (en) * | 2019-06-17 | 2019-09-20 | 北京影谱科技股份有限公司 | For the neural network training method of target detection, device, equipment, medium |
CN110263842B (en) * | 2019-06-17 | 2022-04-05 | 北京影谱科技股份有限公司 | Neural network training method, apparatus, device, and medium for target detection |
CN112487182B (en) * | 2019-09-12 | 2024-04-12 | 华为技术有限公司 | Training method of text processing model, text processing method and device |
CN112487182A (en) * | 2019-09-12 | 2021-03-12 | 华为技术有限公司 | Training method of text processing model, and text processing method and device |
CN111062951B (en) * | 2019-12-11 | 2022-03-25 | 华中科技大学 | Knowledge distillation method based on semantic segmentation intra-class feature difference |
CN111062951A (en) * | 2019-12-11 | 2020-04-24 | 华中科技大学 | Knowledge distillation method based on semantic segmentation intra-class feature difference |
CN111275646A (en) * | 2020-01-20 | 2020-06-12 | 南开大学 | Edge-preserving image smoothing method based on deep learning knowledge distillation technology |
CN111260449B (en) * | 2020-02-17 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Model training method, commodity recommendation device and storage medium |
CN111260449A (en) * | 2020-02-17 | 2020-06-09 | 腾讯科技(深圳)有限公司 | Model training method, commodity recommendation device and storage medium |
CN111312271A (en) * | 2020-02-28 | 2020-06-19 | 云知声智能科技股份有限公司 | Model compression method and system for improving convergence rate and processing performance |
CN111428191B (en) * | 2020-03-12 | 2023-06-16 | 五邑大学 | Antenna downtilt angle calculation method and device based on knowledge distillation and storage medium |
CN111428191A (en) * | 2020-03-12 | 2020-07-17 | 五邑大学 | Antenna downward inclination angle calculation method and device based on knowledge distillation and storage medium |
CN111461212A (en) * | 2020-03-31 | 2020-07-28 | 中国科学院计算技术研究所 | Compression method for point cloud target detection model |
CN111461212B (en) * | 2020-03-31 | 2023-04-07 | 中国科学院计算技术研究所 | Compression method for point cloud target detection model |
CN111476167B (en) * | 2020-04-09 | 2024-03-22 | 北京中科千寻科技有限公司 | One-stage direction remote sensing image target detection method based on student-T distribution assistance |
CN111476167A (en) * | 2020-04-09 | 2020-07-31 | 北京中科千寻科技有限公司 | student-T distribution assistance-based one-stage direction remote sensing image target detection method |
CN111626330A (en) * | 2020-04-23 | 2020-09-04 | 南京邮电大学 | Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation |
CN111554268B (en) * | 2020-07-13 | 2020-11-03 | 腾讯科技(深圳)有限公司 | Language identification method based on language model, text classification method and device |
CN111554268A (en) * | 2020-07-13 | 2020-08-18 | 腾讯科技(深圳)有限公司 | Language identification method based on language model, text classification method and device |
CN111967617A (en) * | 2020-08-14 | 2020-11-20 | 北京深境智能科技有限公司 | Machine learning method based on difficult sample learning and neural network fusion |
CN111967617B (en) * | 2020-08-14 | 2023-11-21 | 北京深境智能科技有限公司 | Machine learning method based on difficult sample learning and neural network fusion |
CN112150478A (en) * | 2020-08-31 | 2020-12-29 | 温州医科大学 | Method and system for constructing semi-supervised image segmentation framework |
CN112200062B (en) * | 2020-09-30 | 2021-09-28 | 广州云从人工智能技术有限公司 | Target detection method and device based on neural network, machine readable medium and equipment |
CN112200062A (en) * | 2020-09-30 | 2021-01-08 | 广州云从人工智能技术有限公司 | Target detection method and device based on neural network, machine readable medium and equipment |
CN112508169A (en) * | 2020-11-13 | 2021-03-16 | 华为技术有限公司 | Knowledge distillation method and system |
CN112529178B (en) * | 2020-12-09 | 2024-04-09 | 中国科学院国家空间科学中心 | Knowledge distillation method and system suitable for detection model without preselection frame |
CN112529178A (en) * | 2020-12-09 | 2021-03-19 | 中国科学院国家空间科学中心 | Knowledge distillation method and system suitable for detection model without preselection frame |
CN112560631A (en) * | 2020-12-09 | 2021-03-26 | 昆明理工大学 | Knowledge distillation-based pedestrian re-identification method |
CN112560693B (en) * | 2020-12-17 | 2022-06-17 | 华中科技大学 | Highway foreign matter identification method and system based on deep learning target detection |
CN112560693A (en) * | 2020-12-17 | 2021-03-26 | 华中科技大学 | Highway foreign matter identification method and system based on deep learning target detection |
CN112731852B (en) * | 2021-01-26 | 2022-03-22 | 南通大学 | Building energy consumption monitoring system based on edge calculation and monitoring method thereof |
CN112731852A (en) * | 2021-01-26 | 2021-04-30 | 南通大学 | Building energy consumption monitoring system based on edge calculation and monitoring method thereof |
CN113505719B (en) * | 2021-07-21 | 2023-11-24 | 山东科技大学 | Gait recognition model compression system and method based on local-integral combined knowledge distillation algorithm |
CN113505719A (en) * | 2021-07-21 | 2021-10-15 | 山东科技大学 | Gait recognition model compression system and method based on local-integral joint knowledge distillation algorithm |
CN113610146B (en) * | 2021-08-03 | 2023-08-04 | 江西鑫铂瑞科技有限公司 | Method for realizing image classification based on knowledge distillation with enhanced intermediate layer feature extraction |
CN113610146A (en) * | 2021-08-03 | 2021-11-05 | 江西鑫铂瑞科技有限公司 | Method for realizing image classification based on knowledge distillation enhanced by interlayer feature extraction |
CN113378866A (en) * | 2021-08-16 | 2021-09-10 | 深圳市爱深盈通信息技术有限公司 | Image classification method, system, storage medium and electronic device |
CN113470036A (en) * | 2021-09-02 | 2021-10-01 | 湖南大学 | Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation |
CN113470036B (en) * | 2021-09-02 | 2021-11-23 | 湖南大学 | Hyperspectral image unsupervised waveband selection method and system based on knowledge distillation |
CN113486185A (en) * | 2021-09-07 | 2021-10-08 | 中建电子商务有限责任公司 | Knowledge distillation method based on joint training, processor and storage medium |
CN113947590A (en) * | 2021-10-26 | 2022-01-18 | 四川大学 | Surface defect detection method based on multi-scale attention guidance and knowledge distillation |
US12020425B2 (en) | 2021-12-03 | 2024-06-25 | Contemporary Amperex Technology Co., Limited | Fast anomaly detection method and system based on contrastive representation distillation |
CN114998570A (en) * | 2022-07-19 | 2022-09-02 | 上海闪马智能科技有限公司 | Method and device for determining object detection frame, storage medium and electronic device |
CN116612378A (en) * | 2023-05-22 | 2023-08-18 | 河南大学 | Improved-SSD-based detection method for small underwater targets in complex backgrounds with unbalanced data |
CN116612378B (en) * | 2023-05-22 | 2024-07-05 | 河南大学 | Improved-SSD-based detection method for small underwater targets in complex backgrounds with unbalanced data |
CN118552739A (en) * | 2024-07-30 | 2024-08-27 | 山东航天电子技术研究所 | Image segmentation model compression method based on hardware perception |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108764462A (en) | A kind of convolutional neural networks optimization method of knowledge based distillation | |
CN111476294B (en) | Zero sample image identification method and system based on generation countermeasure network | |
CN108564029B (en) | Face attribute recognition method based on cascade multitask learning deep neural network | |
CN108805200B (en) | Optical remote sensing scene classification method and device based on depth twin residual error network | |
CN111488474B (en) | Fine-grained freehand sketch image retrieval method based on attention enhancement | |
Fang et al. | DART: Domain-adversarial residual-transfer networks for unsupervised cross-domain image classification | |
CN112651406B (en) | Depth perception and multi-mode automatic fusion RGB-D significance target detection method | |
CN110633708A (en) | Deep network significance detection method based on global model and local optimization | |
CN109241982A (en) | Object detection method based on depth layer convolutional neural networks | |
CN106570522B (en) | Object recognition model establishing method and object recognition method | |
CN110321859A (en) | A kind of optical remote sensing scene classification method based on the twin capsule network of depth | |
CN107292352A (en) | Image classification method and device based on convolutional neural networks | |
CN112365514A (en) | Semantic segmentation method based on improved PSPNet | |
Li et al. | A review of deep learning methods for pixel-level crack detection | |
CN112801138B (en) | Multi-person gesture estimation method based on human body topological structure alignment | |
Wang et al. | Hyperspectral image classification via deep network with attention mechanism and multigroup strategy | |
CN116012722A (en) | Remote sensing image scene classification method | |
CN115223017B (en) | Multi-scale feature fusion bridge detection method based on depth separable convolution | |
Hu et al. | Supervised multi-scale attention-guided ship detection in optical remote sensing images | |
CN111815680A (en) | Deep convolutional neural network automatic horizon tracking method based on constant fast mapping | |
CN115527098A (en) | Infrared small target detection method based on global mean contrast space attention | |
CN118230175A (en) | Real estate mapping data processing method and system based on artificial intelligence | |
Wang et al. | Underground defects detection based on GPR by fusing simple linear iterative clustering phash (SLIC-phash) and convolutional block attention module (CBAM)-YOLOv8 | |
CN117671364A (en) | Model processing method and device for image recognition, electronic equipment and storage medium | |
Zhang et al. | A small target detection algorithm based on improved YOLOv5 in aerial image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
AD01 | Patent right deemed abandoned | Effective date of abandoning: 20220701 ||