CN106355248A - Deep convolution neural network training method and device - Google Patents

Deep convolution neural network training method and device

Info

Publication number
CN106355248A
Authority
CN
China
Prior art keywords
model
DCNN
training
pruning
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610738135.7A
Other languages
Chinese (zh)
Inventor
乔宇
刘家铭
王亚立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201610738135.7A priority Critical patent/CN106355248A/en
Publication of CN106355248A publication Critical patent/CN106355248A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Abstract

The present invention relates to the field of deep learning, and in particular to a deep convolutional neural network (DCNN) training method and device. The method comprises the following steps: a. pre-training a DCNN on a large-scale source dataset and pruning the DCNN; b. performing transfer learning on the pruned DCNN; c. performing model compression and pruning on the transferred DCNN using a small-scale target dataset. During transfer learning from the large-scale source dataset to the small-scale target dataset, model compression and pruning are applied to the DCNN, exploiting the complementary advantages of transfer learning and model compression. This improves transfer learning ability, reduces the DCNN's risk of overfitting and its deployment difficulty on the small-scale target dataset, and improves the predictive ability of the model on the target dataset.

Description

Deep convolutional neural network training method and device
Technical field
The present invention relates to the field of deep learning, and in particular to a deep convolutional neural network training method and device.
Background art
In recent years, with the rapid development of the Internet and computer technology, deep convolutional neural networks (DCNNs) have achieved breakthrough success in challenging tasks such as image classification and audio recognition. However, DCNN models are large and complex, and large-scale data are needed to optimize their parameters. Many real-world problems are supported only by small-scale data, and it is difficult to obtain a high-performance DCNN directly from the small-scale training data of the target task. A widely used strategy is transfer learning; in deep learning research, transfer learning is an effective technique for modeling small-scale target datasets. A large body of work shows that a DCNN trained on a large-scale source dataset learns general-purpose representations and can serve as a pre-trained model for a small-scale target dataset [Donahue et al., 2014; Yosinski et al., 2014]. That is, a large DCNN is first trained on a large-scale dataset for a generic task (the source dataset), and the pre-trained DCNN is then fine-tuned on the small-scale dataset of the target task (the target dataset) to obtain a DCNN for the target task. However, the DCNN obtained by pre-training on the source dataset contains a large amount of model redundancy, which severely restricts the ability of transfer learning and in turn degrades the DCNN's prediction performance on the target task.
To reduce this redundancy, researchers have proposed compressing and pruning DCNNs [Hinton et al., 2015; Han et al., 2015; Han et al., 2016]. Among them, [Han et al., 2015; Han et al., 2016] propose an iterative pruning strategy for DCNN model parameters that achieves a considerable compression ratio. In addition, [Hinton et al., 2015] distills the "knowledge" in a large DCNN into a small DCNN, effectively guiding the training of the small DCNN and thereby indirectly compressing the large DCNN. From the perspective of model compression, however, these pruning and compression strategies [Hinton et al., 2015; Han et al., 2015; Han et al., 2016] operate mainly on the same large-scale source dataset and do not address transfer learning to a small-scale target dataset. The problem of building a high-performance DCNN for a small-scale target dataset therefore remains.
The references related to this application include:
Yosinski, J., Clune, J., Bengio, Y., Lipson, H. (2014). How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems.
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T. (2014). DeCAF: A deep convolutional activation feature for generic visual recognition. In International Conference on Machine Learning.
Han, S., Pool, J., Tran, J., Dally, W. (2015). Learning both weights and connections for efficient neural networks. In Advances in Neural Information Processing Systems.
Hinton, G., Vinyals, O., Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
Krizhevsky, A., Sutskever, I., Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.
Han, S., Mao, H., Dally, W. J. (2016). Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. In International Conference on Learning Representations.
Summary of the invention
The present invention provides a deep convolutional neural network training method and device, aiming to solve, at least to some extent, one of the above technical problems in the prior art.
To solve the above problems, the present invention adopts the following technical solution:
A deep convolutional neural network training method, comprising the following steps:
Step a: pre-training a DCNN on a large-scale source dataset, and pruning the DCNN;
Step b: performing transfer learning on the pruned DCNN;
Step c: performing model compression on the transferred DCNN using a small-scale target dataset.
In the technical solution adopted by the embodiment of the present invention, in step a, pre-training the DCNN specifically comprises: pre-training the DCNN on the large-scale source dataset using the back-propagation algorithm and gradient descent. Pruning the DCNN specifically comprises: pruning the model using an iterative prune-retrain strategy, where each iteration consists of two steps; the first step is model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step is model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent, yielding a sparsified DCNN.
In the technical solution adopted by the embodiment of the present invention, in step b, performing transfer learning on the pruned DCNN specifically comprises:
Step b1: changing the output layer of the sparsified DCNN to match the classes of the target dataset, restoring the output layer and the fully connected layers closest to the output layer to dense connectivity, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
Step b2: distilling the implicit knowledge about the target dataset contained in the source dataset, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale target dataset together with its implicit knowledge in the source dataset, thereby realizing transfer learning.
In the technical solution adopted by the embodiment of the present invention, in step b, performing transfer learning on the pruned DCNN further comprises:
Step b3: using the modified sparsified DCNN as the trunk model;
Step b4: using the DCNN pre-trained on the source dataset as the implicit-knowledge reference model;
Step b5: copying the output layer of the implicit-knowledge reference model and the fully connected layers closest to its output layer to form an additional branch of the trunk model, and attaching the additional branch to the corresponding layer of the trunk model;
Step b6: designing a main loss function by comparing the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function by comparing the predictions of the additional branch with the corresponding outputs of the implicit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the additional branch on the target dataset with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
In the technical solution adopted by the embodiment of the present invention, in step c, performing model compression on the transferred DCNN specifically comprises: first pruning the trunk model using the iterative prune-retrain strategy, and learning the parameters of the non-zeroed connections of the trunk model and the additional branch using the total loss function; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; this iteration is repeated to complete model compression.
Another technical solution adopted by the embodiment of the present invention is a deep convolutional neural network training device, comprising:
a model pre-training module, configured to pre-train a DCNN on a large-scale source dataset and to prune the DCNN;
a transfer learning module, configured to perform transfer learning on the pruned DCNN;
a model compression module, configured to perform model compression on the transferred DCNN using a small-scale target dataset.
In the technical solution adopted by the embodiment of the present invention, the model pre-training module pre-trains the DCNN by training it on the large-scale source dataset using the back-propagation algorithm and gradient descent, and prunes the DCNN using an iterative prune-retrain strategy, where each iteration consists of two steps: the first step is model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step is model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent, yielding a sparsified DCNN.
In the technical solution adopted by the embodiment of the present invention, the transfer learning module comprises:
a model modification unit, configured to change the output layer of the sparsified DCNN to match the classes of the target dataset, restore the output layer and the fully connected layers closest to the output layer to dense connectivity, and randomly initialize the model weight parameters of the fully connected layers near the output layer;
a model fine-tuning unit, configured to distill the implicit knowledge about the target dataset contained in the source dataset, and to fine-tune the sparsified DCNN using the explicit knowledge of the small-scale target dataset together with its implicit knowledge in the source dataset, thereby realizing transfer learning.
In the technical solution adopted by the embodiment of the present invention, the transfer learning module performs transfer learning on the pruned DCNN further by: using the modified sparsified DCNN as the trunk model; using the DCNN pre-trained on the source dataset as the implicit-knowledge reference model; copying the output layer of the implicit-knowledge reference model and the fully connected layers closest to its output layer to form an additional branch of the trunk model, and attaching the additional branch to the corresponding layer of the trunk model; designing a main loss function by comparing the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function by comparing the predictions of the additional branch with the corresponding outputs of the implicit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the additional branch on the target dataset with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
In the technical solution adopted by the embodiment of the present invention, the model compression module performs model compression on the transferred DCNN by: first pruning the trunk model using the iterative prune-retrain strategy, and learning the parameters of the non-zeroed connections of the trunk model and the additional branch using the total loss function; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; this iteration is repeated to complete model compression.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: the deep convolutional neural network training method and device of the embodiments exploit the complementary advantages of transfer learning and model compression. During transfer learning from a large-scale source dataset to a small-scale target dataset, model compression and pruning are applied to the DCNN, which improves transfer learning ability, reduces the DCNN's risk of overfitting and its deployment difficulty on the small-scale target dataset, and improves the model's predictive ability on the target dataset. The compressed DCNN obtained by the present invention is applicable to high-technology fields with limited computation and storage, such as mobile terminals, embedded devices and robots, and has considerable economic and practical value.
Brief description of the drawings
Fig. 1 is a flowchart of the deep convolutional neural network training method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the deep convolutional neural network training device according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and are not intended to limit it.
The deep convolutional neural network training method and device of the embodiments of the present invention exploit the complementary advantages of transfer learning and model compression. During transfer learning from a large-scale source dataset to a small-scale target dataset, model compression and pruning are applied to the DCNN, which improves transfer learning ability, reduces the DCNN's risk of overfitting and its deployment difficulty on the small-scale target dataset, and improves its prediction and recognition accuracy.
Specifically, referring to Fig. 1, which is a flowchart of the deep convolutional neural network training method according to an embodiment of the present invention, the method comprises the following steps:
Step 100: pre-training a DCNN on a large-scale source dataset and pruning the DCNN to obtain a sparsified DCNN;
In step 100, pre-training the DCNN specifically comprises: training a DCNN on the large-scale source dataset using the back-propagation algorithm and gradient descent. Pruning the DCNN specifically comprises: pruning the model using an iterative prune-retrain strategy. Each iteration consists of two steps. The first step is model pruning: the model weight parameters of low significance (e.g., low absolute value) in the DCNN are set to zero, so that the neural network connections corresponding to these parameters no longer take effect in the DCNN, the network structure becomes sparse, and the effect of model pruning is achieved. The second step is model retraining: the pruned DCNN is trained with the back-propagation algorithm and gradient descent, where only the parameters that have not been zeroed out are updated. By repeating this iteration several times, as many network connections as possible are deleted without affecting the classification performance of the DCNN, sparsifying the network and reducing model redundancy.
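For illustration only, the prune-retrain iteration described above could be sketched as follows in PyTorch. This is a hedged example rather than the patent's implementation: the 50% sparsity level, the SGD hyperparameters, the magnitude criterion and the helper names are assumptions.

```python
import torch

def prune_by_magnitude(model, sparsity=0.5):
    """Zero out the lowest-magnitude fraction of each weight tensor and return binary masks."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:              # skip biases and other 1-D parameters
            continue
        k = max(1, int(param.numel() * sparsity))
        threshold = param.detach().abs().flatten().kthvalue(k).values
        mask = (param.detach().abs() > threshold).float()
        param.data.mul_(mask)            # low-significance weights are set to zero
        masks[name] = mask
    return masks

def retrain(model, masks, loader, loss_fn, epochs=1, lr=1e-3):
    """Model retraining: only parameters that were not zeroed out are updated."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            for name, param in model.named_parameters():
                if name in masks and param.grad is not None:
                    param.grad.mul_(masks[name])   # keep pruned connections at zero
            opt.step()

# One prune-retrain iteration; repeating it several times yields the sparsified DCNN:
# masks = prune_by_magnitude(dcnn, sparsity=0.5)
# retrain(dcnn, masks, source_loader, torch.nn.CrossEntropyLoss())
```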
Step 200: performing transfer learning on the pruned DCNN using the explicit knowledge of the target dataset and the implicit knowledge of the source dataset, thereby transferring the DCNN to the target domain;
In step 200, performing transfer learning on the pruned DCNN specifically comprises the following steps:
Step 201: changing the output layer of the sparsified DCNN to match the classes of the target dataset, restoring the output layer and the fully connected layers closest to the output layer to dense connectivity, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
In step 201, the number of fully connected layers that are restored to dense connectivity and reinitialized is not fixed; the optimal value depends on factors such as the task and the neural network architecture. In this embodiment, the number of fully connected layers subjected to this operation is preferably two.
Step 202: distilling the implicit knowledge about the target dataset contained in the source dataset, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale target dataset together with its implicit knowledge in the source dataset, thereby realizing transfer learning;
In step 202, to improve the prediction performance of the model, this embodiment introduces into transfer learning not only the target dataset but also the implicit knowledge about the target dataset contained in the source dataset. Specifically, this embodiment modifies the DCNN model as follows:
1. The modified sparsified DCNN is used as the trunk model; when target data are fed into the trunk model, its output layer produces prediction probabilities for the target data.
2. The DCNN pre-trained on the source dataset is used as the implicit-knowledge reference model; when target data are fed into this reference model, its output layer produces soft labels (a softmax with temperature parameter t). These soft labels correspond to the source-data classes and contain the implicit knowledge about the target data that is present in the source data.
3. The output layer of the implicit-knowledge reference model and the fully connected layers closest to its output layer are copied to form an additional branch of the trunk model, and the additional branch is attached to the corresponding layer of the trunk model. The number of copied fully connected layers should equal the number of layers restored to dense connectivity and reinitialized in step 201. When target data are fed into the trunk model and passed through this additional branch, its output layer produces soft prediction probabilities (a softmax with temperature parameter t) for the implicit knowledge in the source data.
4. A main loss function is designed by comparing the predictions of the trunk model with the corresponding labels of the target training set. An extra loss function is designed by comparing the predictions of the additional branch with the corresponding outputs of the implicit-knowledge reference model; this extra loss function mainly serves to extract from the reference model the implicit knowledge about the target data contained in the source data. The total loss function is the weighted sum of the main loss function and the extra loss function. Using this total loss function, the trunk model and the additional branch are trained on the target dataset with the back-propagation algorithm, thereby realizing transfer learning (an illustrative sketch of this total loss is given below).
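As an illustration and not part of the patent, a minimal sketch of such a total loss function could look like the following, where the weighting factor alpha, the temperature t and the use of a KL divergence for the extra loss are assumptions:

```python
import torch.nn.functional as F

def total_loss(trunk_logits, branch_logits, reference_logits, target_labels,
               alpha=0.5, t=2.0):
    """Weighted sum of the main loss and the extra (soft-label) loss."""
    # Main loss: trunk-model predictions vs. hard labels of the target training set.
    main_loss = F.cross_entropy(trunk_logits, target_labels)

    # Extra loss: additional-branch predictions vs. reference-model soft labels,
    # both softened with temperature t (softmax with temperature).
    soft_targets = F.softmax(reference_logits / t, dim=1)
    log_soft_branch = F.log_softmax(branch_logits / t, dim=1)
    extra_loss = F.kl_div(log_soft_branch, soft_targets, reduction="batchmean") * (t * t)

    return main_loss + alpha * extra_loss
```

Training the trunk model and the additional branch on the target dataset by back-propagating this total loss corresponds to the weighted-sum formulation of step b6.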
Step 300: performing model compression on the transferred DCNN using the small-scale target dataset;
In step 300, after the sparsified DCNN has been transferred to the target domain by transfer learning, it is compressed using the small-scale target dataset, so that the redundancy of the sparsified DCNN on the target domain is further reduced and the model's predictive ability on the target dataset is improved. Specifically, since only the trunk model is ultimately needed to make predictions on the target test set, this embodiment applies model compression only to the trunk model. The compression strategy is similar to the iterative prune-retrain strategy of step 100, except that in each iteration pruning is applied only to the trunk model, while retraining learns the parameters of the non-zeroed connections of the trunk model and of the additional branch using the total loss function.
After the prune-retrain cycle is completed, some neurons of the fully connected layers of the trunk model are cut, further compressing the model size. Specifically, the compression proceeds as follows: a subset of samples is randomly drawn from the target training set and fed into the trunk model to obtain the activation values of these samples on a given fully connected layer; for this fully connected layer, the neurons of low significance (e.g., low average activation) are cut first, and retraining is then performed with the total loss function; this iteration is repeated to complete model compression.
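A minimal sketch of this activation-based neuron pruning is given below, again as an illustrative assumption rather than the patent's implementation; the keep ratio, the ReLU activation and the layer interfaces (two consecutive fully connected layers of the trunk model) are hypothetical.

```python
import torch

@torch.no_grad()
def prune_fc_neurons(fc_layer, next_layer, sample_batch, keep_ratio=0.75,
                     features_fn=None):
    """Cut the neurons of an FC layer whose average activation on a drawn sample subset is lowest.

    fc_layer / next_layer: consecutive torch.nn.Linear layers of the trunk model.
    sample_batch: inputs randomly drawn from the target training set.
    features_fn: maps raw inputs to the input features of fc_layer (e.g. the convolutional trunk).
    """
    feats = features_fn(sample_batch) if features_fn is not None else sample_batch
    acts = torch.relu(fc_layer(feats))          # activations of the fully connected layer
    mean_act = acts.mean(dim=0)                 # average activation per neuron
    n_keep = max(1, int(mean_act.numel() * keep_ratio))
    keep = mean_act.topk(n_keep).indices.sort().values

    # Rebuild the two Linear layers with only the kept (high-significance) neurons.
    new_fc = torch.nn.Linear(fc_layer.in_features, n_keep)
    new_fc.weight.copy_(fc_layer.weight[keep])
    new_fc.bias.copy_(fc_layer.bias[keep])
    new_next = torch.nn.Linear(n_keep, next_layer.out_features)
    new_next.weight.copy_(next_layer.weight[:, keep])
    new_next.bias.copy_(next_layer.bias)
    return new_fc, new_next   # afterwards, retrain the trunk model with the total loss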
Referring to Fig. 2, which is a schematic structural diagram of the deep convolutional neural network training device according to an embodiment of the present invention, the device comprises a model pre-training module, a transfer learning module and a model compression module.
Model pre-training module: configured to pre-train a DCNN on a large-scale source dataset and to prune the DCNN, obtaining a sparsified DCNN. The model pre-training module pre-trains the DCNN by training it on the large-scale source dataset using the back-propagation algorithm and gradient descent, and prunes the DCNN using the iterative prune-retrain strategy. Each iteration consists of two steps. The first step is model pruning: the model weight parameters of low significance (e.g., low absolute value) in the DCNN are set to zero, so that the corresponding neural network connections no longer take effect in the DCNN, achieving the effect of model pruning. The second step is model retraining: the pruned DCNN is trained with the back-propagation algorithm and gradient descent, i.e., only the parameters that have not been zeroed out are updated. By repeating this iteration several times, as many network connections as possible are deleted without affecting the classification performance of the DCNN, sparsifying the network and reducing model redundancy.
Transfer learning module: configured to perform transfer learning on the pruned DCNN using the explicit knowledge of the target dataset and the implicit knowledge of the source dataset, thereby transferring the DCNN to the target domain. Specifically, the transfer learning module comprises a model modification unit and a model fine-tuning unit.
Model modification unit: configured to change the output layer of the sparsified DCNN to match the classes of the target dataset, restore the output layer and the fully connected layers closest to the output layer to dense connectivity, and randomly initialize the model weight parameters of the fully connected layers near the output layer.
Model fine-tuning unit: configured to distill the implicit knowledge about the target dataset contained in the source dataset, and to fine-tune the sparsified DCNN using the explicit knowledge of the small-scale target dataset together with its implicit knowledge in the source dataset, thereby realizing transfer learning.
In this embodiment, to improve the prediction performance of the model, not only the target dataset but also the implicit knowledge about the target dataset contained in the source dataset is introduced into transfer learning. Specifically, the DCNN model is modified as follows:
1. The modified sparsified DCNN is used as the trunk model; when target data are fed into the trunk model, its output layer produces prediction probabilities for the target data.
2. The DCNN pre-trained on the source dataset is used as the implicit-knowledge reference model; when target data are fed into this reference model, its output layer produces soft labels (a softmax with temperature parameter t). These soft labels correspond to the source-data classes and contain the implicit knowledge about the target data that is present in the source data.
3. The output layer of the implicit-knowledge reference model and the fully connected layers closest to its output layer are copied to form an additional branch of the trunk model, and the additional branch is attached to the corresponding layer of the trunk model. When target data are fed into the trunk model and passed through this additional branch, its output layer produces soft prediction probabilities (a softmax with temperature parameter t) for the implicit knowledge in the source data.
4. A main loss function is designed by comparing the predictions of the trunk model with the corresponding labels of the target training set. An extra loss function is designed by comparing the predictions of the additional branch with the corresponding outputs of the implicit-knowledge reference model; this extra loss function mainly serves to extract from the reference model the implicit knowledge about the target data contained in the source data. The total loss function is the weighted sum of the main loss function and the extra loss function. Using this total loss function, the trunk model and the additional branch are trained on the target dataset with the back-propagation algorithm, thereby realizing transfer learning.
Model compression module: configured to perform model compression on the transferred DCNN using the small-scale target dataset. After the sparsified DCNN has been transferred to the target domain by transfer learning, it is compressed using the small-scale target dataset, so that the redundancy of the sparsified DCNN on the target domain is further reduced and the model's predictive ability on the target dataset is improved. Specifically, since only the trunk model is ultimately needed to make predictions on the target test set, this embodiment applies model compression only to the trunk model. The compression strategy is similar to the iterative prune-retrain strategy of step 100, except that in each iteration pruning is applied only to the trunk model, while retraining learns the parameters of the non-zeroed connections of the trunk model and of the additional branch using the total loss function.
After the prune-retrain cycle is completed, some neurons of the fully connected layers of the trunk model are cut, further compressing the model size. Specifically, the compression proceeds as follows: a subset of samples is randomly drawn from the target training set and fed into the trunk model to obtain the activation values of these samples on a given fully connected layer; for this fully connected layer, the neurons of low significance (e.g., low average activation) are cut first, and retraining is then performed with the total loss function; this iteration is repeated to complete model compression.
To demonstrate the practicality of the present invention, experiments were conducted on a scene recognition task of wide practical value. In the experiments, the ImageNet ILSVRC12 object image dataset (containing over one million images) was used as the large-scale source dataset, and the MIT Indoor Scene Recognition database (containing 15,620 scene images) was used as the small-scale target dataset. The widely used AlexNet model (5 convolutional layers and 3 fully connected layers) [Krizhevsky et al., 2012] was selected as the DCNN model for verification. The configuration with the highest performance and the configuration with the highest compression ratio in this embodiment were examined respectively; the results are shown in Table 1. Each group of experiments was evaluated for accuracy on the standard test set of the MIT Indoor Scene Recognition database.
Table 1: Scene recognition accuracy and compression ratio of each experimental group
As shown in Table 1, compared with traditional pruning methods, the present invention not only substantially compresses the deep neural network during transfer learning, reducing the DCNN's risk of overfitting and its deployment difficulty on the small-scale target dataset, but also improves the prediction and recognition accuracy of the DCNN after transfer learning. The present invention is therefore a practical, high-performance deep convolutional neural network training method for small-scale datasets.
The deep convolutional neural network training method and device of the embodiments of the present invention exploit the complementary advantages of transfer learning and model compression. During transfer learning from a large-scale source dataset to a small-scale target dataset, model compression and pruning are applied to the DCNN, which improves transfer learning ability, reduces the DCNN's risk of overfitting and its deployment difficulty on the small-scale target dataset, and improves the model's predictive ability on the target dataset. The compressed DCNN obtained by the present invention is applicable to high-technology fields with limited computation and storage, such as mobile terminals, embedded devices and robots, and has considerable economic and practical value.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A deep convolutional neural network training method, characterized in that it comprises the following steps:
Step a: pre-training a DCNN on a large-scale source dataset, and pruning the DCNN;
Step b: performing transfer learning on the pruned DCNN;
Step c: performing model compression on the transferred DCNN using a small-scale target dataset.
2. The deep convolutional neural network training method according to claim 1, characterized in that in step a, pre-training the DCNN specifically comprises: pre-training the DCNN on the large-scale source dataset using the back-propagation algorithm and gradient descent; and pruning the DCNN specifically comprises: pruning the model using an iterative prune-retrain strategy, each iteration consisting of two steps, where the first step is model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step is model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent, yielding a sparsified DCNN.
3. The deep convolutional neural network training method according to claim 2, characterized in that in step b, performing transfer learning on the pruned DCNN specifically comprises:
Step b1: changing the output layer of the sparsified DCNN to match the classes of the target dataset, restoring the output layer and the fully connected layers closest to the output layer to dense connectivity, and randomly initializing the model weight parameters of said fully connected layers near the output layer;
Step b2: distilling the implicit knowledge about the target dataset contained in the source dataset, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale target dataset together with its implicit knowledge in the source dataset, thereby realizing transfer learning.
4. The deep convolutional neural network training method according to claim 3, characterized in that in step b, performing transfer learning on the pruned DCNN further comprises:
Step b3: using the modified sparsified DCNN as the trunk model;
Step b4: using the DCNN pre-trained on the source dataset as the implicit-knowledge reference model;
Step b5: copying the output layer of the implicit-knowledge reference model and the fully connected layers closest to its output layer to form an additional branch of the trunk model, and attaching the additional branch to the corresponding layer of the trunk model;
Step b6: designing a main loss function by comparing the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function by comparing the predictions of the additional branch with the corresponding outputs of the implicit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the additional branch on the target dataset with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
5. The deep convolutional neural network training method according to claim 4, characterized in that in step c, performing model compression on the transferred DCNN specifically comprises: first pruning the trunk model using the iterative prune-retrain strategy, and learning the parameters of the non-zeroed connections of the trunk model and the additional branch using the total loss function; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; this iteration being repeated to complete model compression.
6. A deep convolutional neural network training device, characterized in that it comprises:
a model pre-training module, configured to pre-train a DCNN on a large-scale source dataset and to prune the DCNN;
a transfer learning module, configured to perform transfer learning on the pruned DCNN;
a model compression module, configured to perform model compression on the transferred DCNN using a small-scale target dataset.
7. The deep convolutional neural network training device according to claim 6, characterized in that the model pre-training module pre-trains the DCNN by training it on the large-scale source dataset using the back-propagation algorithm and gradient descent, and prunes the DCNN using an iterative prune-retrain strategy, each iteration consisting of two steps, where the first step is model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step is model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent, yielding a sparsified DCNN.
8. The deep convolutional neural network training device according to claim 7, characterized in that the transfer learning module comprises:
a model modification unit, configured to change the output layer of the sparsified DCNN to match the classes of the target dataset, restore the output layer and the fully connected layers closest to the output layer to dense connectivity, and randomly initialize the model weight parameters of the fully connected layers near the output layer;
a model fine-tuning unit, configured to distill the implicit knowledge about the target dataset contained in the source dataset, and to fine-tune the sparsified DCNN using the explicit knowledge of the small-scale target dataset together with its implicit knowledge in the source dataset, thereby realizing transfer learning.
9. The deep convolutional neural network training device according to claim 8, characterized in that the transfer learning module performs transfer learning on the pruned DCNN further by: using the modified sparsified DCNN as the trunk model; using the DCNN pre-trained on the source dataset as the implicit-knowledge reference model; copying the output layer of the implicit-knowledge reference model and the fully connected layers closest to its output layer to form an additional branch of the trunk model, and attaching the additional branch to the corresponding layer of the trunk model; designing a main loss function by comparing the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function by comparing the predictions of the additional branch with the corresponding outputs of the implicit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the additional branch on the target dataset with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
10. The deep convolutional neural network training device according to claim 9, characterized in that the model compression module performs model compression on the transferred DCNN by: first pruning the trunk model using the iterative prune-retrain strategy, and learning the parameters of the non-zeroed connections of the trunk model and the additional branch using the total loss function; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; this iteration being repeated to complete model compression.
CN201610738135.7A 2016-08-26 2016-08-26 Deep convolution neural network training method and device Pending CN106355248A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610738135.7A CN106355248A (en) 2016-08-26 2016-08-26 Deep convolution neural network training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610738135.7A CN106355248A (en) 2016-08-26 2016-08-26 Deep convolution neural network training method and device

Publications (1)

Publication Number Publication Date
CN106355248A true CN106355248A (en) 2017-01-25

Family

ID=57855127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610738135.7A Pending CN106355248A (en) 2016-08-26 2016-08-26 Deep convolution neural network training method and device

Country Status (1)

Country Link
CN (1) CN106355248A (en)

Cited By (69)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107102644A (en) * 2017-06-22 2017-08-29 华南师范大学 The underwater robot method for controlling trajectory and control system learnt based on deeply
CN107239802A (en) * 2017-06-28 2017-10-10 广东工业大学 A kind of image classification method and device
CN107247989A (en) * 2017-06-15 2017-10-13 北京图森未来科技有限公司 A kind of neural network training method and device
CN107392241A (en) * 2017-07-17 2017-11-24 北京邮电大学 A kind of image object sorting technique that sampling XGBoost is arranged based on weighting
CN107480611A (en) * 2017-07-31 2017-12-15 浙江大学 A kind of crack identification method based on deep learning convolutional neural networks
CN107491790A (en) * 2017-08-25 2017-12-19 北京图森未来科技有限公司 A kind of neural network training method and device
CN108108662A (en) * 2017-11-24 2018-06-01 深圳市华尊科技股份有限公司 Deep neural network identification model and recognition methods
CN108230354A (en) * 2017-05-18 2018-06-29 深圳市商汤科技有限公司 Target following, network training method, device, electronic equipment and storage medium
CN108229682A (en) * 2018-02-07 2018-06-29 深圳市唯特视科技有限公司 A kind of image detection countercheck based on backpropagation attack
CN108334934A (en) * 2017-06-07 2018-07-27 北京深鉴智能科技有限公司 Convolutional neural networks compression method based on beta pruning and distillation
CN108446724A (en) * 2018-03-12 2018-08-24 江苏中天科技软件技术有限公司 A kind of fusion feature sorting technique
CN108459585A (en) * 2018-04-09 2018-08-28 东南大学 Power station fan method for diagnosing faults based on sparse locally embedding depth convolutional network
CN108573287A (en) * 2018-05-11 2018-09-25 浙江工业大学 A kind of training method of the image codec based on deep neural network
CN108596243A (en) * 2018-04-20 2018-09-28 西安电子科技大学 The eye movement for watching figure and condition random field attentively based on classification watches figure prediction technique attentively
CN108629288A (en) * 2018-04-09 2018-10-09 华中科技大学 A kind of gesture identification model training method, gesture identification method and system
CN108805258A (en) * 2018-05-23 2018-11-13 北京图森未来科技有限公司 A kind of neural network training method and its device, computer server
CN108876774A (en) * 2018-06-07 2018-11-23 浙江大学 A kind of people counting method based on convolutional neural networks
CN108960415A (en) * 2017-05-23 2018-12-07 上海寒武纪信息科技有限公司 Processing unit and processing system
CN109034385A (en) * 2017-06-12 2018-12-18 辉达公司 With the system and method for sparse data training neural network
CN109063835A (en) * 2018-07-11 2018-12-21 中国科学技术大学 The compression set and method of neural network
CN109272118A (en) * 2018-08-10 2019-01-25 北京达佳互联信息技术有限公司 Data training method, device, equipment and storage medium
CN109376615A (en) * 2018-09-29 2019-02-22 苏州科达科技股份有限公司 For promoting the method, apparatus and storage medium of deep learning neural network forecast performance
CN109472274A (en) * 2017-09-07 2019-03-15 富士通株式会社 The training device and method of deep learning disaggregated model
CN109492754A (en) * 2018-11-06 2019-03-19 深圳市友杰智新科技有限公司 One kind is based on deep neural network model compression and accelerated method
CN109522949A (en) * 2018-11-07 2019-03-26 北京交通大学 Model of Target Recognition method for building up and device
CN109615858A (en) * 2018-12-21 2019-04-12 深圳信路通智能技术有限公司 A kind of intelligent parking behavior judgment method based on deep learning
CN109635288A (en) * 2018-11-29 2019-04-16 东莞理工学院 A kind of resume abstracting method based on deep neural network
CN109685120A (en) * 2018-12-11 2019-04-26 中科恒运股份有限公司 Quick training method and terminal device of the disaggregated model under finite data
CN109725531A (en) * 2018-12-13 2019-05-07 中南大学 A kind of successive learning method based on gate making mechanism
CN109726045A (en) * 2017-10-27 2019-05-07 百度(美国)有限责任公司 System and method for the sparse recurrent neural network of block
CN109815864A (en) * 2019-01-11 2019-05-28 浙江工业大学 A kind of facial image age recognition methods based on transfer learning
WO2019100998A1 (en) * 2017-11-24 2019-05-31 腾讯科技(深圳)有限公司 Voice signal processing model training method, electronic device, and storage medium
WO2019106619A1 (en) * 2017-11-30 2019-06-06 International Business Machines Corporation Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression
CN109960581A (en) * 2017-12-26 2019-07-02 广东欧珀移动通信有限公司 Hardware resource configuration method, device, mobile terminal and storage medium
CN110008854A (en) * 2019-03-18 2019-07-12 中交第二公路勘察设计研究院有限公司 Unmanned plane image Highway Geological Disaster recognition methods based on pre-training DCNN
CN110008880A (en) * 2019-03-27 2019-07-12 深圳前海微众银行股份有限公司 A kind of model compression method and device
CN110059717A (en) * 2019-03-13 2019-07-26 山东大学 Convolutional neural networks automatic division method and system for breast molybdenum target data set
CN110084365A (en) * 2019-03-13 2019-08-02 西安电子科技大学 A kind of service provider system and method based on deep learning
CN110096976A (en) * 2019-04-18 2019-08-06 中国人民解放军国防科技大学 Human behavior micro-Doppler classification method based on sparse migration network
CN110245587A (en) * 2019-05-29 2019-09-17 西安交通大学 A kind of remote sensing image object detection method based on Bayes's transfer learning
CN110348422A (en) * 2019-07-18 2019-10-18 北京地平线机器人技术研发有限公司 Image processing method, device, computer readable storage medium and electronic equipment
WO2019205604A1 (en) * 2018-04-25 2019-10-31 北京市商汤科技开发有限公司 Image processing method, training method, apparatus, device, medium and program
WO2019205391A1 (en) * 2018-04-26 2019-10-31 平安科技(深圳)有限公司 Apparatus and method for generating vehicle damage classification model, and computer readable storage medium
CN110580523A (en) * 2018-06-07 2019-12-17 清华大学 Error calibration method and device for analog neural network processor
CN110647977A (en) * 2019-08-26 2020-01-03 北京空间机电研究所 Method for optimizing Tiny-YOLO network for detecting ship target on satellite
CN110648531A (en) * 2019-09-19 2020-01-03 军事科学院系统工程研究院网络信息研究所 Node mobility prediction method based on deep learning in vehicle-mounted self-organizing network
WO2020019102A1 (en) * 2018-07-23 2020-01-30 Intel Corporation Methods, systems, articles of manufacture and apparatus to train a neural network
CN110799996A (en) * 2017-06-30 2020-02-14 康蒂-特米克微电子有限公司 Knowledge transfer between different deep learning architectures
CN110858253A (en) * 2018-08-17 2020-03-03 第四范式(北京)技术有限公司 Method and system for executing machine learning under data privacy protection
CN110929839A (en) * 2018-09-20 2020-03-27 深圳市商汤科技有限公司 Method and apparatus for training neural network, electronic device, and computer storage medium
CN111091177A (en) * 2019-11-12 2020-05-01 腾讯科技(深圳)有限公司 Model compression method and device, electronic equipment and storage medium
CN111134662A (en) * 2020-02-17 2020-05-12 武汉大学 Electrocardio abnormal signal identification method and device based on transfer learning and confidence degree selection
CN111291841A (en) * 2020-05-13 2020-06-16 腾讯科技(深圳)有限公司 Image recognition model training method and device, computer equipment and storage medium
CN111310520A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Dish identification method, cash registering method, dish order prompting method and related device
TWI700647B (en) * 2018-09-11 2020-08-01 國立清華大學 Electronic apparatus and compression method for artificial neural network
CN109407654B (en) * 2018-12-20 2020-08-04 浙江大学 Industrial data nonlinear causal analysis method based on sparse deep neural network
CN111767996A (en) * 2018-02-27 2020-10-13 上海寒武纪信息科技有限公司 Integrated circuit chip device and related product
CN111931698A (en) * 2020-09-08 2020-11-13 平安国际智慧城市科技股份有限公司 Image deep learning network construction method and device based on small training set
CN112001477A (en) * 2020-06-19 2020-11-27 南京理工大学 Deep learning-based model optimization algorithm for target detection YOLOv3
CN112329931A (en) * 2021-01-04 2021-02-05 北京智源人工智能研究院 Countermeasure sample generation method and device based on proxy model
CN112819157A (en) * 2021-01-29 2021-05-18 商汤集团有限公司 Neural network training method and device and intelligent driving control method and device
CN113222976A (en) * 2021-05-31 2021-08-06 河海大学 Space-time image texture direction detection method and system based on DCNN and transfer learning
CN107832837B (en) * 2017-11-28 2021-09-28 南京大学 Convolutional neural network compression method and decompression method based on compressed sensing principle
CN113780535A (en) * 2021-09-27 2021-12-10 华中科技大学 Model training method and system applied to edge equipment
CN113837376A (en) * 2021-08-30 2021-12-24 厦门大学 Neural network pruning method based on dynamic coding convolution kernel fusion
US11244226B2 (en) 2017-06-12 2022-02-08 Nvidia Corporation Systems and methods for training neural networks with sparse data
US11347308B2 (en) 2019-07-26 2022-05-31 Samsung Electronics Co., Ltd. Method and apparatus with gaze tracking
WO2022116819A1 (en) * 2020-12-04 2022-06-09 北京有竹居网络技术有限公司 Model training method and apparatus, machine translation method and apparatus, and device and storage medium
WO2022127907A1 (en) * 2020-12-17 2022-06-23 Moffett Technologies Co., Limited System and method for domain specific neural network pruning

Cited By (110)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108230354B (en) * 2017-05-18 2022-05-10 深圳市商汤科技有限公司 Target tracking method, network training method, device, electronic equipment and storage medium
CN108230354A (en) * 2017-05-18 2018-06-29 深圳市商汤科技有限公司 Target following, network training method, device, electronic equipment and storage medium
CN108960415B (en) * 2017-05-23 2021-04-20 上海寒武纪信息科技有限公司 Processing apparatus and processing system
CN108960415A (en) * 2017-05-23 2018-12-07 上海寒武纪信息科技有限公司 Processing unit and processing system
WO2018223822A1 (en) * 2017-06-07 2018-12-13 北京深鉴智能科技有限公司 Pruning- and distillation-based convolutional neural network compression method
CN108334934A (en) * 2017-06-07 2018-07-27 北京深鉴智能科技有限公司 Convolutional neural networks compression method based on beta pruning and distillation
US11244226B2 (en) 2017-06-12 2022-02-08 Nvidia Corporation Systems and methods for training neural networks with sparse data
CN109034385A (en) * 2017-06-12 2018-12-18 辉达公司 With the system and method for sparse data training neural network
CN107247989A (en) * 2017-06-15 2017-10-13 北京图森未来科技有限公司 A kind of neural network training method and device
US11625594B2 (en) 2017-06-15 2023-04-11 Beijing Tusen Zhitu Technology Co., Ltd. Method and device for student training networks with teacher networks
CN107247989B (en) * 2017-06-15 2020-11-24 北京图森智途科技有限公司 Real-time computer vision processing method and device
CN107102644B (en) * 2017-06-22 2019-12-10 华南师范大学 Underwater robot track control method and control system based on deep reinforcement learning
CN107102644A (en) * 2017-06-22 2017-08-29 华南师范大学 The underwater robot method for controlling trajectory and control system learnt based on deeply
CN107239802A (en) * 2017-06-28 2017-10-10 广东工业大学 A kind of image classification method and device
CN107239802B (en) * 2017-06-28 2021-06-01 广东工业大学 Image classification method and device
CN110799996A (en) * 2017-06-30 2020-02-14 康蒂-特米克微电子有限公司 Knowledge transfer between different deep learning architectures
CN107392241A (en) * 2017-07-17 2017-11-24 北京邮电大学 A kind of image object sorting technique that sampling XGBoost is arranged based on weighting
CN107480611A (en) * 2017-07-31 2017-12-15 浙江大学 A kind of crack identification method based on deep learning convolutional neural networks
CN107480611B (en) * 2017-07-31 2020-06-30 浙江大学 Crack identification method based on deep learning convolutional neural network
CN107491790A (en) * 2017-08-25 2017-12-19 北京图森未来科技有限公司 A kind of neural network training method and device
CN109472274B (en) * 2017-09-07 2022-06-28 富士通株式会社 Training device and method for deep learning classification model
CN109472274A (en) * 2017-09-07 2019-03-15 富士通株式会社 The training device and method of deep learning disaggregated model
CN109726045A (en) * 2017-10-27 2019-05-07 百度(美国)有限责任公司 System and method for the sparse recurrent neural network of block
US11651223B2 (en) 2017-10-27 2023-05-16 Baidu Usa Llc Systems and methods for block-sparse recurrent neural networks
CN109726045B (en) * 2017-10-27 2023-07-25 百度(美国)有限责任公司 System and method for block sparse recurrent neural network
US11158304B2 (en) 2017-11-24 2021-10-26 Tencent Technology (Shenzhen) Company Limited Training method of speech signal processing model with shared layer, electronic device and storage medium
CN108108662A (en) * 2017-11-24 2018-06-01 深圳市华尊科技股份有限公司 Deep neural network identification model and recognition methods
WO2019100998A1 (en) * 2017-11-24 2019-05-31 腾讯科技(深圳)有限公司 Voice signal processing model training method, electronic device, and storage medium
CN107832837B (en) * 2017-11-28 2021-09-28 南京大学 Convolutional neural network compression method and decompression method based on compressed sensing principle
JP7300798B2 (en) 2017-11-30 2023-06-30 インターナショナル・ビジネス・マシーンズ・コーポレーション Systems, methods, computer programs, and computer readable storage media for compressing neural network data
JP2021504837A (en) * 2017-11-30 International Business Machines Corporation Fully connected/recurrent deep network compression through enforcing spatial locality to the weight matrix and providing frequency compression
WO2019106619A1 (en) * 2017-11-30 2019-06-06 International Business Machines Corporation Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression
CN111357019B (en) * 2017-11-30 2023-12-29 国际商业机器公司 Compressing fully connected/recursive layers of depth network(s) by implementing spatial locality on weight matrices and implementing frequency compression
CN111357019A (en) * 2017-11-30 2020-06-30 国际商业机器公司 Compressing fully connected/recursive layers of deep network(s) by enforcing spatial locality on weight matrices and implementing frequency compression
GB2582233A (en) * 2017-11-30 2020-09-16 Ibm Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression
CN109960581B (en) * 2017-12-26 2021-06-01 Oppo广东移动通信有限公司 Hardware resource allocation method and device, mobile terminal and storage medium
CN109960581A (en) * 2017-12-26 2019-07-02 广东欧珀移动通信有限公司 Hardware resource configuration method, device, mobile terminal and storage medium
CN108229682A (en) * 2018-02-07 深圳市唯特视科技有限公司 Image detection countermeasure method based on back-propagation attacks
CN111767996A (en) * 2018-02-27 2020-10-13 上海寒武纪信息科技有限公司 Integrated circuit chip device and related product
CN111767996B (en) * 2018-02-27 2024-03-05 上海寒武纪信息科技有限公司 Integrated circuit chip device and related products
CN108446724B (en) * 2018-03-12 2020-06-16 江苏中天科技软件技术有限公司 Fusion feature classification method
CN108446724A (en) * 2018-03-12 江苏中天科技软件技术有限公司 Fusion feature classification method
CN108629288A (en) * 2018-04-09 华中科技大学 Gesture recognition model training method, gesture recognition method and system
CN108629288B (en) * 2018-04-09 2020-05-19 华中科技大学 Gesture recognition model training method, gesture recognition method and system
CN108459585A (en) * 2018-04-09 东南大学 Power station fan fault diagnosis method based on sparse local embedding deep convolutional networks
CN108596243B (en) * 2018-04-20 2021-09-10 西安电子科技大学 Eye movement gaze prediction method based on hierarchical gaze view and conditional random field
CN108596243A (en) * 2018-04-20 西安电子科技大学 Eye movement gaze prediction method based on hierarchical gaze views and conditional random fields
WO2019205604A1 (en) * 2018-04-25 2019-10-31 北京市商汤科技开发有限公司 Image processing method, training method, apparatus, device, medium and program
US11334763B2 (en) 2018-04-25 2022-05-17 Beijing Sensetime Technology Development Co., Ltd. Image processing methods, training methods, apparatuses, devices, media, and programs
WO2019205391A1 (en) * 2018-04-26 2019-10-31 平安科技(深圳)有限公司 Apparatus and method for generating vehicle damage classification model, and computer readable storage medium
CN108573287A (en) * 2018-05-11 浙江工业大学 Training method for an image codec based on deep neural networks
CN108573287B (en) * 2018-05-11 2021-10-29 浙江工业大学 Deep neural network-based image codec training method
CN108805258A (en) * 2018-05-23 北京图森未来科技有限公司 Neural network training method and device, and computer server
CN108876774A (en) * 2018-06-07 浙江大学 People counting method based on convolutional neural networks
CN110580523A (en) * 2018-06-07 2019-12-17 清华大学 Error calibration method and device for analog neural network processor
CN109063835B (en) * 2018-07-11 2021-07-09 中国科学技术大学 Neural network compression device and method
CN109063835A (en) * 2018-07-11 中国科学技术大学 Neural network compression device and method
WO2020019102A1 (en) * 2018-07-23 2020-01-30 Intel Corporation Methods, systems, articles of manufacture and apparatus to train a neural network
CN109272118A (en) * 2018-08-10 2019-01-25 北京达佳互联信息技术有限公司 Data training method, device, equipment and storage medium
CN110858253A (en) * 2018-08-17 2020-03-03 第四范式(北京)技术有限公司 Method and system for executing machine learning under data privacy protection
TWI700647B (en) * 2018-09-11 2020-08-01 國立清華大學 Electronic apparatus and compression method for artificial neural network
US11270207B2 (en) 2018-09-11 2022-03-08 National Tsing Hua University Electronic apparatus and compression method for artificial neural network
CN110929839A (en) * 2018-09-20 2020-03-27 深圳市商汤科技有限公司 Method and apparatus for training neural network, electronic device, and computer storage medium
CN109376615A (en) * 2018-09-29 苏州科达科技股份有限公司 Method, apparatus and storage medium for improving the prediction performance of deep learning networks
CN109376615B (en) * 2018-09-29 2020-12-18 苏州科达科技股份有限公司 Method, device and storage medium for improving prediction performance of deep learning network
CN109492754A (en) * 2018-11-06 深圳市友杰智新科技有限公司 Deep neural network model compression and acceleration method
CN109522949A (en) * 2018-11-07 北京交通大学 Target recognition model establishing method and device
CN109522949B (en) * 2018-11-07 2021-01-26 北京交通大学 Target recognition model establishing method and device
CN109635288B (en) * 2018-11-29 2023-05-23 东莞理工学院 Resume extraction method based on deep neural network
CN109635288A (en) * 2018-11-29 东莞理工学院 Resume extraction method based on deep neural networks
CN111310520A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Dish identification method, cash registering method, dish order prompting method and related device
CN109685120A (en) * 2018-12-11 中科恒运股份有限公司 Fast training method and terminal device for classification models with limited data
CN111310520B (en) * 2018-12-11 2023-11-21 阿里巴巴集团控股有限公司 Dish identification method, cashing method, dish ordering method and related devices
CN109725531B (en) * 2018-12-13 2021-09-21 中南大学 Continuous learning method based on gating mechanism
CN109725531A (en) * 2018-12-13 中南大学 Continuous learning method based on gating mechanism
CN109407654B (en) * 2018-12-20 2020-08-04 浙江大学 Industrial data nonlinear causal analysis method based on sparse deep neural network
CN109615858A (en) * 2018-12-21 深圳信路通智能技术有限公司 Intelligent parking behavior judgment method based on deep learning
CN109815864A (en) * 2019-01-11 浙江工业大学 Facial image age recognition method based on transfer learning
CN109815864B (en) * 2019-01-11 2021-01-01 浙江工业大学 Facial image age identification method based on transfer learning
CN110084365A (en) * 2019-03-13 西安电子科技大学 Service providing system and method based on deep learning
CN110084365B (en) * 2019-03-13 2023-08-11 西安电子科技大学 Service providing system and method based on deep learning
CN110059717A (en) * 2019-03-13 山东大学 Automatic convolutional neural network segmentation method and system for breast molybdenum target (mammography) datasets
CN110008854A (en) * 2019-03-18 中交第二公路勘察设计研究院有限公司 Unmanned aerial vehicle image highway geological disaster identification method based on pre-trained DCNN
CN110008854B (en) * 2019-03-18 2021-04-30 中交第二公路勘察设计研究院有限公司 Unmanned aerial vehicle image highway geological disaster identification method based on pre-training DCNN
CN110008880A (en) * 2019-03-27 深圳前海微众银行股份有限公司 Model compression method and device
CN110008880B (en) * 2019-03-27 2023-09-29 深圳前海微众银行股份有限公司 Model compression method and device
CN110096976A (en) * 2019-04-18 中国人民解放军国防科技大学 Human behavior micro-Doppler classification method based on sparse transfer networks
CN110245587A (en) * 2019-05-29 西安交通大学 Remote sensing image object detection method based on Bayesian transfer learning
CN110348422B (en) * 2019-07-18 2021-11-09 北京地平线机器人技术研发有限公司 Image processing method, image processing device, computer-readable storage medium and electronic equipment
CN110348422A (en) * 2019-07-18 2019-10-18 北京地平线机器人技术研发有限公司 Image processing method, device, computer readable storage medium and electronic equipment
US11347308B2 (en) 2019-07-26 2022-05-31 Samsung Electronics Co., Ltd. Method and apparatus with gaze tracking
CN110647977A (en) * 2019-08-26 2020-01-03 北京空间机电研究所 Method for optimizing Tiny-YOLO network for detecting ship target on satellite
CN110648531A (en) * 2019-09-19 2020-01-03 军事科学院系统工程研究院网络信息研究所 Node mobility prediction method based on deep learning in vehicle-mounted self-organizing network
CN110648531B (en) * 2019-09-19 2020-12-04 军事科学院系统工程研究院网络信息研究所 Node mobility prediction method based on deep learning in vehicle-mounted self-organizing network
CN111091177A (en) * 2019-11-12 2020-05-01 腾讯科技(深圳)有限公司 Model compression method and device, electronic equipment and storage medium
CN111134662A (en) * 2020-02-17 武汉大学 Abnormal ECG signal identification method and device based on transfer learning and confidence selection
CN111291841A (en) * 2020-05-13 2020-06-16 腾讯科技(深圳)有限公司 Image recognition model training method and device, computer equipment and storage medium
CN112001477A (en) * 2020-06-19 南京理工大学 Deep learning-based model optimization algorithm for the YOLOv3 target detector
CN111931698A (en) * 2020-09-08 2020-11-13 平安国际智慧城市科技股份有限公司 Image deep learning network construction method and device based on small training set
WO2022116819A1 (en) * 2020-12-04 2022-06-09 北京有竹居网络技术有限公司 Model training method and apparatus, machine translation method and apparatus, and device and storage medium
WO2022127907A1 (en) * 2020-12-17 2022-06-23 Moffett Technologies Co., Limited System and method for domain specific neural network pruning
CN116438544A (en) * 2020-12-17 2023-07-14 墨芯国际有限公司 System and method for domain-specific neural network pruning
CN112329931A (en) * 2021-01-04 北京智源人工智能研究院 Adversarial sample generation method and device based on a proxy model
CN112329931B (en) * 2021-01-04 2021-05-07 北京智源人工智能研究院 Adversarial sample generation method and device based on a proxy model
CN112819157A (en) * 2021-01-29 2021-05-18 商汤集团有限公司 Neural network training method and device and intelligent driving control method and device
CN113222976B (en) * 2021-05-31 2022-08-05 河海大学 Space-time image texture direction detection method and system based on DCNN and transfer learning
CN113222976A (en) * 2021-05-31 2021-08-06 河海大学 Space-time image texture direction detection method and system based on DCNN and transfer learning
CN113837376B (en) * 2021-08-30 2023-09-15 厦门大学 Neural network pruning method based on dynamic coding convolution kernel fusion
CN113837376A (en) * 2021-08-30 2021-12-24 厦门大学 Neural network pruning method based on dynamic coding convolution kernel fusion
CN113780535A (en) * 2021-09-27 华中科技大学 Model training method and system applied to edge devices

Similar Documents

Publication Publication Date Title
CN106355248A (en) Deep convolution neural network training method and device
CN109598269A (en) Semantic segmentation method based on multi-resolution input and pyramid dilated convolution
CN108921294A (en) Progressive block-wise knowledge distillation method for neural network acceleration
CN110222140A (en) Cross-modal retrieval method based on adversarial learning and asymmetric hashing
CN109543502A (en) Semantic segmentation method based on deep multi-scale neural networks
CN110378208B (en) Behavior recognition method based on deep residual networks
CN105772407A (en) Waste classification robot based on image recognition technology
CN106203363A (en) Activity recognition method for human skeleton motion sequences
CN111709321B (en) Human behavior recognition method based on graph convolutional neural networks
CN110222717A (en) Image processing method and device
CN107657204A (en) Construction method of a deep network model, and facial expression recognition method and system
CN110222634A (en) Human posture recognition method based on convolutional neural networks
CN107145893A (en) Image recognition algorithm and system based on deep convolutional networks
CN111709289B (en) Multi-task deep learning model for improving human body analysis performance
CN107203752A (en) Face recognition method combining deep learning and a feature L2-norm constraint
CN109284741A (en) Large-scale remote sensing image retrieval method and system based on deep hashing networks
CN109871892A (en) Robot vision cognition system based on small-sample metric learning
CN110321997A (en) Highly parallel computing platform, system, and computation implementation method
CN108564166A (en) Semi-supervised feature learning method based on convolutional neural networks with symmetric parallel connections
CN112651360B (en) Skeleton action recognition method under small-sample conditions
CN109102475A (en) Image rain removal method and device
CN113052254A (en) Multi-attention ghost residual fusion classification model and classification method thereof
CN113128424A (en) Action recognition method based on attention-mechanism graph convolutional neural networks
Zhang et al. Skip-attention encoder–decoder framework for human motion prediction
CN111612046B (en) Feature pyramid graph convolution neural network and application thereof in 3D point cloud classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20170125)