CN106355248A - Deep convolutional neural network training method and device - Google Patents
Deep convolutional neural network training method and device
- Publication number
- CN106355248A CN106355248A CN201610738135.7A CN201610738135A CN106355248A CN 106355248 A CN106355248 A CN 106355248A CN 201610738135 A CN201610738135 A CN 201610738135A CN 106355248 A CN106355248 A CN 106355248A
- Authority
- CN
- China
- Prior art keywords
- model
- dcnn
- training
- pruning
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/061—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The present invention relates to the field of deep learning, and in particular to a deep convolutional neural network (DCNN) training method and device. The method comprises the steps of: a, pre-training a DCNN on a large-scale source data set and pruning the DCNN; b, performing transfer learning on the pruned DCNN; c, performing model compression and pruning on the transferred DCNN using a small-scale target data set. By exploiting the complementary advantages of transfer learning and model compression, the DCNN is compressed and pruned during the transfer from the large-scale source data set to the small-scale target data set, which improves transfer learning ability, reduces the risk of overfitting and the difficulty of deployment on the small-scale target data set, and improves the predictive ability of the model on the target data set.
Description
Technical field
The present invention relates to the field of deep learning, and in particular to a deep convolutional neural network training method and device.
Background art
In recent years, with the rapid development of the Internet and computer technology, deep convolutional neural networks (deep convolutional neural network, DCNN) have achieved breakthrough success in challenging tasks such as image classification and audio recognition. However, DCNN model structures are large and complex, and large-scale data is required to optimize the model parameters. In practice, many real-world problems are supported only by small-scale data, and it is difficult to obtain a high-performance DCNN directly from the small-scale training data of the target task. One widely used strategy is transfer learning; in deep learning research, transfer learning is an effective technique for modeling small-scale target data sets. A large body of research shows that a DCNN trained on a large-scale source data set learns general-purpose representations and can serve as a pre-trained model for a small-scale target data set [donahue et al., 2014; yosinski et al., 2014]: a DCNN with a large structure is first trained on the large-scale data set of a generic task (the source data set), and the pre-trained DCNN is then fine-tuned on the small-scale data set of the target task (the target data set) to obtain a DCNN for the target task. However, a DCNN obtained by pre-training on the source data set contains a large amount of model redundancy, which severely restricts the ability to transfer and in turn degrades the predictive performance of the transferred DCNN on the target task.
To reduce this redundancy, researchers have proposed compressing and pruning DCNNs [hinton et al., 2015; han et al., 2015; han et al., 2016]. Among these, [han et al., 2015; han et al., 2016] proposed an iterative pruning strategy for DCNN model parameters that achieves a considerable compression ratio. In addition, [hinton et al., 2015] distills the "knowledge" in a large-scale DCNN into a small-scale DCNN, effectively guiding the training of the small-scale DCNN and thereby indirectly compressing the large-scale DCNN. From the perspective of model compression, however, these pruning and compression strategies [hinton et al., 2015; han et al., 2015; han et al., 2016] operate on the same large-scale source data set and do not address transfer learning to a small-scale target data set. Such methods therefore leave open the problem of building a high-performance DCNN for a small-scale target data set.
The references cited above in relation to the present application are as follows:
Yosinski, J., Clune, J., Bengio, Y., Lipson, H. (2014). How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems.
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T. (2014). DeCAF: a deep convolutional activation feature for generic visual recognition. In International Conference on Machine Learning.
Han, S., Pool, J., Tran, J., Dally, W. (2015). Learning both weights and connections for efficient neural networks. In Advances in Neural Information Processing Systems.
Hinton, G., Vinyals, O., Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
Krizhevsky, A., Sutskever, I., Hinton, G.E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.
Han, S., Mao, H., Dally, W.J. (2016). Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In International Conference on Learning Representations.
Summary of the invention
The present invention provides a deep convolutional neural network training method and device, intended to solve, at least to some extent, one of the above technical problems in the prior art.
To solve the above problems, the technical solution adopted by the present invention is as follows:
A deep convolutional neural network training method, comprising the following steps:
Step a: pre-training a DCNN on a large-scale source data set, and performing model pruning on the DCNN;
Step b: performing transfer learning on the pruned DCNN;
Step c: performing model compression on the transferred DCNN using the small-scale target data set.
The technical solution adopted by the embodiment of the present invention further includes: in step a, pre-training the DCNN is specifically: pre-training the DCNN on the large-scale source data set using the back-propagation algorithm and gradient descent; and performing model pruning on the DCNN is specifically: pruning the model with an iterative prune-retrain strategy, each iteration consisting of two steps, the first step being model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step being model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent to obtain a sparsified DCNN.
The technical solution adopted by the embodiment of the present invention further includes: in step b, performing transfer learning on the pruned DCNN specifically comprises:
Step b1: changing the output layer of the sparsified DCNN to the classes of the target data set, restoring the output layer and the fully connected layers nearest the output layer to dense layers, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
Step b2: distilling the tacit knowledge about the target data set contained in the source data set, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale data set together with its tacit knowledge in the source data set, thereby realizing transfer learning.
The technical solution adopted by the embodiment of the present invention further includes: in step b, performing transfer learning on the pruned DCNN specifically further comprises:
Step b3: taking the modified sparsified DCNN as the trunk model;
Step b4: taking the DCNN pre-trained on the source data set as the tacit-knowledge reference model;
Step b5: copying the output layer and the fully connected layers near the output layer of the tacit-knowledge reference model as extra branches of the trunk model, and attaching the extra branches to the corresponding layers of the trunk model;
Step b6: designing a main loss function that compares the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function that compares the predictions of the extra branches with the corresponding outputs of the tacit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the extra branches on the target data set with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
The technical solution adopted by the embodiment of the present invention further includes: in step c, performing model compression on the transferred DCNN specifically comprises: first pruning the trunk model with the iterative prune-retrain strategy, using the total loss function to learn the parameters of the non-zeroed connections in the trunk model and of the extra branches; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; and iterating in this way to complete model compression.
Another technical solution adopted by the embodiment of the present invention is a deep convolutional neural network training device, comprising:
a model pre-training module for pre-training a DCNN on a large-scale source data set and performing model pruning on the DCNN;
a transfer learning module for performing transfer learning on the pruned DCNN;
a model compression module for performing model compression on the transferred DCNN using the small-scale target data set.
The technical solution adopted by the embodiment of the present invention further includes: the model pre-training module pre-trains the DCNN specifically by pre-training the DCNN on the large-scale source data set using the back-propagation algorithm and gradient descent, and performs model pruning specifically with an iterative prune-retrain strategy, each iteration consisting of two steps, the first step being model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step being model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent to obtain a sparsified DCNN.
The technical solution adopted by the embodiment of the present invention further includes: the transfer learning module comprises:
a model modification unit for changing the output layer of the sparsified DCNN to the classes of the target data set, restoring the output layer and the fully connected layers near the output layer to dense layers, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
a model fine-tuning unit for distilling the tacit knowledge about the target data set contained in the source data set, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale data set together with its tacit knowledge in the source data set, thereby realizing transfer learning.
The technical solution adopted by the embodiment of the present invention further includes: the transfer learning module performing transfer learning on the pruned DCNN specifically further comprises: taking the modified sparsified DCNN as the trunk model; taking the DCNN pre-trained on the source data set as the tacit-knowledge reference model; copying the output layer and the fully connected layers near the output layer of the tacit-knowledge reference model as extra branches of the trunk model, and attaching the extra branches to the corresponding layers of the trunk model; designing a main loss function that compares the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function that compares the predictions of the extra branches with the corresponding outputs of the tacit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the extra branches on the target data set with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
The technical solution adopted by the embodiment of the present invention further includes: the model compression module performing model compression on the transferred DCNN specifically comprises: first pruning the trunk model with the iterative prune-retrain strategy, using the total loss function to learn the parameters of the non-zeroed connections in the trunk model and of the extra branches; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; and iterating in this way to complete model compression.
Compared with the prior art, the embodiments of the present invention provide the following beneficial effects: the deep convolutional neural network training method and device of the embodiments of the present invention exploit the complementary advantages of transfer learning and model compression, compressing and pruning the DCNN during transfer learning from the large-scale source data set to the small-scale target data set. This improves transfer learning ability, reduces the risk of overfitting and the difficulty of deploying the DCNN on the small-scale target data set, and improves the predictive ability of the model on the target data set. The compressed DCNN obtained by the present invention is suitable for high-technology fields with limited computation and storage, such as mobile terminals, embedded devices, and robots, and has considerable economic and practical value.
Brief description of the drawings
Fig. 1 is a flow chart of the deep convolutional neural network training method of an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the deep convolutional neural network training device of an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is further described below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the present invention, not to limit it.
The deep convolutional neural network training method and device of the embodiments of the present invention exploit the complementary advantages of transfer learning and model compression: during transfer learning from the large-scale source data set to the small-scale target data set, the DCNN is compressed and pruned, which improves transfer learning ability, reduces the risk of overfitting and the difficulty of deploying the DCNN on the small-scale target data set, and improves its prediction and recognition accuracy.
Specifically, referring to Fig. 1, a flow chart of the deep convolutional neural network training method of an embodiment of the present invention, the method comprises the following steps:
Step 100: pre-training a DCNN on a large-scale source data set, and performing model pruning on the DCNN to obtain a sparsified DCNN;
In step 100, pre-training the DCNN is specifically: pre-training a DCNN on the large-scale source data set using the back-propagation algorithm and gradient descent. Performing model pruning on the DCNN is specifically: pruning the model with an iterative prune-retrain strategy. Each iteration consists of two steps. The first step is model pruning: the model weight parameters of low significance (e.g., low absolute value) in the DCNN are set to zero. The neural network connections corresponding to these parameters then no longer contribute to the DCNN, the network structure becomes sparse, and the effect of model pruning is achieved. The second step of the iteration is model retraining: the pruned DCNN is trained using the back-propagation algorithm and gradient descent, with only the parameters that were not zeroed out being trained. By repeating this iteration several times, as many network connections as possible are deleted without affecting the classification performance of the DCNN, sparsifying the network and reducing model redundancy. A sketch of this prune-retrain loop follows.
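As a concrete illustration of the prune-retrain iteration of step 100, the following PyTorch-style sketch prunes weights by magnitude and masks their gradients during retraining. It is a minimal sketch under stated assumptions, not the claimed implementation: the per-iteration pruning fraction, the use of SGD, and the gradient-masking mechanism for freezing zeroed weights are choices made for the example.

```python
import torch

def magnitude_prune(model, prune_frac=0.10):
    """One pruning step: zero out the fraction of weights with the
    smallest absolute value in each weight tensor, and return binary
    masks marking the surviving connections."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:                      # skip biases and norm parameters
            continue
        k = max(1, int(prune_frac * p.numel()))
        threshold = p.abs().flatten().kthvalue(k).values
        masks[name] = (p.abs() > threshold).float()
        p.data.mul_(masks[name])             # low-significance weights -> zero
    return masks

def retrain(model, masks, loader, loss_fn, lr=1e-3):
    """Retraining step: train only the parameters that were not zeroed
    out, by masking gradients so pruned connections stay at zero."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        for name, p in model.named_parameters():
            if name in masks and p.grad is not None:
                p.grad.mul_(masks[name])     # keep pruned weights frozen
        opt.step()

# iterative prune-retrain, repeated until the desired sparsity is reached:
# for _ in range(num_iterations):
#     masks = magnitude_prune(dcnn)
#     retrain(dcnn, masks, source_loader, torch.nn.CrossEntropyLoss())
```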
Step 200: performing transfer learning on the pruned DCNN using the explicit knowledge of the target data set and the tacit knowledge of the source data set, transferring the DCNN to the target domain;
In step 200, performing transfer learning on the pruned DCNN specifically comprises the following steps:
Step 201: changing the output layer of the sparsified DCNN to the classes of the target data set, restoring the output layer and the fully connected layers near the output layer to dense layers, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
In step 201, the number of fully connected layers that need to be restored to dense form and reinitialized is not fixed; the optimal value varies with factors such as the task and the neural network structure. In the embodiment of the present invention, this operation is preferably performed on two fully connected layers. A sketch of this model surgery follows.
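As an illustration of step 201, the sketch below swaps the output layer of a sparsified AlexNet-style network for the target classes and restores the two fully connected layers nearest the output to dense, randomly initialized layers. The use of torchvision's AlexNet layout and its classifier indices are assumptions for the example, not part of the claimed method.

```python
import torch.nn as nn
from torchvision.models import alexnet

def adapt_head(sparse_dcnn, num_target_classes):
    """Step 201 sketch: re-densify and reinitialize the last two fully
    connected layers, with the new output layer sized to the target
    data set's classes."""
    # in torchvision's AlexNet, classifier[4] is the penultimate FC layer
    # and classifier[6] is the output layer
    sparse_dcnn.classifier[4] = nn.Linear(4096, 4096)   # dense + random init
    sparse_dcnn.classifier[6] = nn.Linear(4096, num_target_classes)
    return sparse_dcnn

# e.g. the MIT Indoor Scene Recognition data set has 67 scene classes
trunk = adapt_head(alexnet(), num_target_classes=67)
```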
Step 202: distilling the tacit knowledge about the target data set contained in the source data set, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale data set together with its tacit knowledge in the source data set, thereby realizing transfer learning;
In step 202, to improve model prediction performance, the embodiment of the present invention introduces into transfer learning, beyond the target data set itself, the tacit knowledge about the target data set contained in the source data set. Specifically, the embodiment of the present invention modifies the DCNN model as follows:
1. The modified sparsified DCNN is used as the trunk model. When target data is fed into the trunk model, its output layer produces the prediction probabilities for the target data.
2. The DCNN pre-trained on the source data set is used as the tacit-knowledge reference model. When target data is fed into this reference model, its output layer produces soft labels (a temperature parameter t is added to the softmax function). These soft labels correspond to the source-data class information and contain the tacit knowledge about the target data held in the source data.
3. The output layer and the fully connected layers near the output layer of the tacit-knowledge reference model are copied as extra branches of the trunk model, and the extra branches are attached to the corresponding layers of the trunk model. The number of copied fully connected layers should equal the number of layers restored to dense form and reinitialized in step 201. When target data is fed into the trunk model and passed through these extra branches, their output layer produces the soft prediction probabilities of the tacit knowledge in the source data (again with temperature parameter t in the softmax function).
4. A main loss function is designed by comparing the predictions of the trunk model with the corresponding labels of the target training set. An extra loss function is designed by comparing the predictions of the extra branches with the corresponding outputs of the tacit-knowledge reference model; this extra loss function serves mainly to extract from the reference model the tacit knowledge about the target data contained in the source data. The total loss function is the weighted sum of the main loss function and the extra loss function. Using this total loss function, the trunk model and the extra branches are trained on the target data set with the back-propagation algorithm, realizing transfer learning. A sketch of this loss construction follows.
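The following is a minimal sketch of the total loss of step 202, in the style of knowledge distillation [hinton et al., 2015]. The temperature value, the weighting coefficient `alpha`, and the use of KL divergence (with the conventional t² scaling) to compare the extra branch's soft predictions with the reference model's soft labels are assumptions made for illustration; the patent itself specifies only a weighted sum of a main loss and an extra loss.

```python
import torch
import torch.nn.functional as F

def total_loss(trunk_logits, branch_logits, ref_logits, labels,
               t=4.0, alpha=0.5):
    """Total loss = weighted sum of the main loss (trunk predictions vs.
    target labels) and the extra loss (extra-branch soft predictions vs.
    the reference model's soft labels at temperature t)."""
    # main loss: trunk model prediction vs. target training-set labels
    main = F.cross_entropy(trunk_logits, labels)
    # extra loss: soft predictions of the extra branch vs. soft labels
    # of the tacit-knowledge reference model (temperature-scaled softmax)
    extra = F.kl_div(F.log_softmax(branch_logits / t, dim=1),
                     F.softmax(ref_logits / t, dim=1),
                     reduction='batchmean') * t * t
    return alpha * main + (1.0 - alpha) * extra

# per batch: ref_logits come from the frozen reference model,
# trunk_logits and branch_logits from the trunk and its extra branch:
# loss = total_loss(trunk(x), branch(x), ref_model(x).detach(), y)
```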
Step 300: performing model compression on the transferred DCNN using the small-scale target data set;
In step 300, after the sparsified DCNN has been transferred to the target domain via transfer learning, the transferred DCNN is compressed using the small-scale target data set, so that the resulting sparsified DCNN further reduces its redundancy on the target domain and its predictive ability on the target data set improves. Specifically, since ultimately only the trunk model is needed to predict and evaluate on the target test set, the embodiment of the present invention performs model compression only on the trunk model. The compression strategy is similar to the iterative prune-retrain strategy of step 100, except that in each iteration pruning is performed only within the trunk model, while retraining uses the total loss function to learn the parameters of the non-zeroed connections in the trunk model and of the extra branches.
After the prune-retrain phase is complete, some of the neurons of the fully connected layers in the trunk model are cut, further compressing the model scale. Specifically, the compression proceeds as follows: a subset of samples is randomly drawn from the target training set as input to the trunk model, and the activation values of these samples on a given fully connected layer are obtained. For that fully connected layer, the neurons of low significance (e.g., low average activation value) are cut first, and retraining is then performed with the total loss function; this iteration is repeated to complete model compression. A sketch of this neuron-cutting step follows.
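The neuron-cutting step can be sketched as follows. Here `forward_to_layer` is a hypothetical helper assumed to run the trunk model up to the output of the chosen fully connected layer, and the per-iteration pruning fraction is an illustrative choice; neither is fixed by the patent.

```python
import torch

@torch.no_grad()
def prune_fc_neurons(fc_layer, sample_batch, forward_to_layer, frac=0.10):
    """Cut the least significant neurons of one fully connected layer,
    taking significance to be the average activation over a randomly
    drawn subset of target training samples."""
    acts = forward_to_layer(sample_batch)       # shape: (batch, num_neurons)
    significance = acts.abs().mean(dim=0)       # mean activation per neuron
    k = max(1, int(frac * significance.numel()))
    cut = significance.topk(k, largest=False).indices
    fc_layer.weight[cut, :] = 0.0               # remove incoming connections
    fc_layer.bias[cut] = 0.0                    # silence the neuron's bias
    return cut

# after cutting, retrain the trunk model and extra branches with the
# total loss function, and repeat the draw-cut-retrain iteration
```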
Referring to Fig. 2, a schematic structural diagram of the deep convolutional neural network training device of an embodiment of the present invention, the device comprises a model pre-training module, a transfer learning module, and a model compression module.
Model pre-training module: for pre-training a DCNN on a large-scale source data set and performing model pruning on the DCNN to obtain a sparsified DCNN. The model pre-training module pre-trains the DCNN specifically by pre-training the DCNN on the large-scale source data set using the back-propagation algorithm and gradient descent, and performs model pruning specifically with an iterative prune-retrain strategy. Each iteration consists of two steps. The first step is model pruning: the model weight parameters of low significance (e.g., low absolute value) in the DCNN are set to zero, so that the neural network connections corresponding to these parameters no longer contribute to the DCNN and the effect of model pruning is achieved. The second step of the iteration is model retraining: the pruned DCNN is trained using the back-propagation algorithm and gradient descent, i.e., only the parameters that were not zeroed out are trained. By repeating this iteration several times, as many network connections as possible are deleted without affecting the classification performance of the DCNN, sparsifying the network and reducing model redundancy.
Transfer learning module: for performing transfer learning on the pruned DCNN using the explicit knowledge of the target data set and the tacit knowledge of the source data set, transferring the DCNN to the target domain. Specifically, the transfer learning module comprises a model modification unit and a model fine-tuning unit.
Model modification unit: for changing the output layer of the sparsified DCNN to the classes of the target data set, restoring the output layer and the fully connected layers near the output layer to dense layers, and randomly initializing the model weight parameters of the fully connected layers near the output layer.
Model fine-tuning unit: for distilling the tacit knowledge about the target data set contained in the source data set, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale data set together with its tacit knowledge in the source data set, thereby realizing transfer learning.
In the embodiment of the present invention, to improve model prediction performance, the tacit knowledge about the target data set contained in the source data set is introduced into transfer learning beyond the target data set itself. Specifically, the embodiment of the present invention modifies the DCNN model as follows:
1. The modified sparsified DCNN is used as the trunk model. When target data is fed into the trunk model, its output layer produces the prediction probabilities for the target data.
2. The DCNN pre-trained on the source data set is used as the tacit-knowledge reference model. When target data is fed into this reference model, its output layer produces soft labels (a temperature parameter t is added to the softmax function); these soft labels correspond to the source-data class information and contain the tacit knowledge about the target data held in the source data.
3. The output layer and the fully connected layers near the output layer of the tacit-knowledge reference model are copied as extra branches of the trunk model, and the extra branches are attached to the corresponding layers of the trunk model. When target data is fed into the trunk model and passed through these extra branches, their output layer produces the soft prediction probabilities of the tacit knowledge in the source data (again with temperature parameter t in the softmax function).
4. A main loss function is designed by comparing the predictions of the trunk model with the corresponding labels of the target training set. An extra loss function is designed by comparing the predictions of the extra branches with the corresponding outputs of the tacit-knowledge reference model; this extra loss function serves mainly to extract from the reference model the tacit knowledge about the target data contained in the source data. The total loss function is the weighted sum of the main loss function and the extra loss function. Using this total loss function, the trunk model and the extra branches are trained on the target data set with the back-propagation algorithm, realizing transfer learning.
Model compression module: for performing model compression on the transferred DCNN using the small-scale target data set. After the sparsified DCNN has been transferred to the target domain via transfer learning, the transferred DCNN is compressed using the small-scale target data set, so that the resulting sparsified DCNN further reduces its redundancy on the target domain and its predictive ability on the target data set improves. Specifically, since ultimately only the trunk model is needed to predict and evaluate on the target test set, the embodiment of the present invention performs model compression only on the trunk model. The compression strategy is similar to the iterative prune-retrain strategy of step 100, except that in each iteration pruning is performed only within the trunk model, while retraining uses the total loss function to learn the parameters of the non-zeroed connections in the trunk model and of the extra branches.
After the prune-retrain phase is complete, some of the neurons of the fully connected layers in the trunk model are cut, further compressing the model scale. Specifically, the compression proceeds as follows: a subset of samples is randomly drawn from the target training set as input to the trunk model, and the activation values of these samples on a given fully connected layer are obtained. For that fully connected layer, the neurons of low significance (e.g., low average activation value) are cut first, and retraining is then performed with the total loss function; this iteration is repeated to complete model compression.
To demonstrate the practicality of the present invention, we verified it experimentally on a scene recognition task of broad practical value. In the experiments, the ImageNet ILSVRC12 object image data set (containing over a million images) was used as the large-scale source data set, and the MIT Indoor Scene Recognition database of scene images (containing 15,620 images) was used as the small-scale target data set. In addition, the widely used AlexNet model (5 convolutional layers, 3 fully connected layers) [krizhevsky et al., 2012] was selected as the DCNN model for verification. The models in the embodiment of the present invention with the highest performance and with the highest compression ratio were examined separately, with results as shown in Table 1. Each experimental group was evaluated for accuracy on the standard test set of the MIT Indoor Scene Recognition database.
Table 1: Scene recognition accuracy and compression ratio of each experimental group
As Table 1 shows, compared with traditional pruning methods, the present invention not only substantially compresses the deep neural network during transfer learning, reducing the risk of overfitting and the difficulty of deploying the DCNN on the small-scale target data set, but also improves the prediction and recognition accuracy of the DCNN after transfer learning. The present invention is thus a practicable, high-performance deep convolutional neural network training method for small-scale data sets.
The deep convolutional neural network training method and device of the embodiments of the present invention exploit the complementary advantages of transfer learning and model compression: during transfer learning from the large-scale source data set to the small-scale target data set, the DCNN is compressed and pruned, which improves transfer learning ability, reduces the risk of overfitting and the difficulty of deploying the DCNN on the small-scale target data set, and improves the predictive ability of the model on the target data set. The compressed DCNN obtained by the present invention is suitable for high-technology fields with limited computation and storage, such as mobile terminals, embedded devices, and robots, and has considerable economic and practical value.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A deep convolutional neural network training method, characterized in that it comprises the following steps:
Step a: pre-training a DCNN on a large-scale source data set, and performing model pruning on the DCNN;
Step b: performing transfer learning on the pruned DCNN;
Step c: performing model compression on the transferred DCNN using a small-scale target data set.
2. The deep convolutional neural network training method according to claim 1, characterized in that in step a, pre-training the DCNN is specifically: pre-training the DCNN on the large-scale source data set using the back-propagation algorithm and gradient descent; and performing model pruning on the DCNN is specifically: pruning the model with an iterative prune-retrain strategy, each iteration consisting of two steps, the first step being model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step being model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent to obtain a sparsified DCNN.
3. The deep convolutional neural network training method according to claim 2, characterized in that in step b, performing transfer learning on the pruned DCNN specifically comprises:
Step b1: changing the output layer of the sparsified DCNN to the classes of the target data set, restoring the output layer and the fully connected layers near the output layer to dense layers, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
Step b2: distilling the tacit knowledge about the target data set contained in the source data set, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale data set together with its tacit knowledge in the source data set, thereby realizing transfer learning.
4. The deep convolutional neural network training method according to claim 3, characterized in that in step b, performing transfer learning on the pruned DCNN specifically further comprises:
Step b3: taking the modified sparsified DCNN as the trunk model;
Step b4: taking the DCNN pre-trained on the source data set as the tacit-knowledge reference model;
Step b5: copying the output layer and the fully connected layers near the output layer of the tacit-knowledge reference model as extra branches of the trunk model, and attaching the extra branches to the corresponding layers of the trunk model;
Step b6: designing a main loss function that compares the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function that compares the predictions of the extra branches with the corresponding outputs of the tacit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the extra branches on the target data set with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
5. The deep convolutional neural network training method according to claim 4, characterized in that in step c, performing model compression on the transferred DCNN specifically comprises: first pruning the trunk model with the iterative prune-retrain strategy, using the total loss function to learn the parameters of the non-zeroed connections in the trunk model and of the extra branches; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; and iterating in this way to complete model compression.
6. A deep convolutional neural network training device, characterized in that it comprises:
a model pre-training module for pre-training a DCNN on a large-scale source data set and performing model pruning on the DCNN;
a transfer learning module for performing transfer learning on the pruned DCNN;
a model compression module for performing model compression on the transferred DCNN using a small-scale target data set.
7. The deep convolutional neural network training device according to claim 6, characterized in that the model pre-training module pre-trains the DCNN specifically by pre-training the DCNN on the large-scale source data set using the back-propagation algorithm and gradient descent, and performs model pruning on the DCNN specifically with an iterative prune-retrain strategy, each iteration consisting of two steps, the first step being model pruning, in which the model weight parameters of low significance in the DCNN are set to zero, and the second step being model retraining, in which the pruned DCNN is trained using the back-propagation algorithm and gradient descent to obtain a sparsified DCNN.
8. The deep convolutional neural network training device according to claim 7, characterized in that the transfer learning module comprises:
a model modification unit for changing the output layer of the sparsified DCNN to the classes of the target data set, restoring the output layer and the fully connected layers near the output layer to dense layers, and randomly initializing the model weight parameters of the fully connected layers near the output layer;
a model fine-tuning unit for distilling the tacit knowledge about the target data set contained in the source data set, and fine-tuning the sparsified DCNN using the explicit knowledge of the small-scale data set together with its tacit knowledge in the source data set, thereby realizing transfer learning.
9. The deep convolutional neural network training device according to claim 8, characterized in that the transfer learning module performing transfer learning on the pruned DCNN specifically further comprises: taking the modified sparsified DCNN as the trunk model; taking the DCNN pre-trained on the source data set as the tacit-knowledge reference model; copying the output layer and the fully connected layers near the output layer of the tacit-knowledge reference model as extra branches of the trunk model, and attaching the extra branches to the corresponding layers of the trunk model; designing a main loss function that compares the predictions of the trunk model with the corresponding labels of the target training set; designing an extra loss function that compares the predictions of the extra branches with the corresponding outputs of the tacit-knowledge reference model; taking the total loss function as the weighted sum of the main loss function and the extra loss function; and training the trunk model and the extra branches on the target data set with the back-propagation algorithm using the total loss function, thereby realizing transfer learning.
10. The deep convolutional neural network training device according to claim 9, characterized in that the model compression module performing model compression on the transferred DCNN specifically comprises: first pruning the trunk model with the iterative prune-retrain strategy, using the total loss function to learn the parameters of the non-zeroed connections in the trunk model and of the extra branches; then randomly drawing a subset of samples from the target training set as input to the trunk model, obtaining the activation values of the drawn samples on a fully connected layer, cutting the neurons of low significance, and retraining with the total loss function; and iterating in this way to complete model compression.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610738135.7A CN106355248A (en) | 2016-08-26 | 2016-08-26 | Deep convolutional neural network training method and device
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610738135.7A CN106355248A (en) | 2016-08-26 | 2016-08-26 | Deep convolutional neural network training method and device
Publications (1)
Publication Number | Publication Date |
---|---|
CN106355248A true CN106355248A (en) | 2017-01-25 |
Family
ID=57855127
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610738135.7A Pending CN106355248A (en) | 2016-08-26 | 2016-08-26 | Deep convolution neural network training method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106355248A (en) |
Cited By (69)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107102644A (en) * | 2017-06-22 | 2017-08-29 | 华南师范大学 | The underwater robot method for controlling trajectory and control system learnt based on deeply |
CN107239802A (en) * | 2017-06-28 | 2017-10-10 | 广东工业大学 | A kind of image classification method and device |
CN107247989A (en) * | 2017-06-15 | 2017-10-13 | 北京图森未来科技有限公司 | A kind of neural network training method and device |
CN107392241A (en) * | 2017-07-17 | 2017-11-24 | 北京邮电大学 | A kind of image object sorting technique that sampling XGBoost is arranged based on weighting |
CN107480611A (en) * | 2017-07-31 | 2017-12-15 | 浙江大学 | A kind of crack identification method based on deep learning convolutional neural networks |
CN107491790A (en) * | 2017-08-25 | 2017-12-19 | 北京图森未来科技有限公司 | A kind of neural network training method and device |
CN108108662A (en) * | 2017-11-24 | 2018-06-01 | 深圳市华尊科技股份有限公司 | Deep neural network identification model and recognition methods |
CN108230354A (en) * | 2017-05-18 | 2018-06-29 | 深圳市商汤科技有限公司 | Target following, network training method, device, electronic equipment and storage medium |
CN108229682A (en) * | 2018-02-07 | 2018-06-29 | 深圳市唯特视科技有限公司 | A kind of image detection countercheck based on backpropagation attack |
CN108334934A (en) * | 2017-06-07 | 2018-07-27 | 北京深鉴智能科技有限公司 | Convolutional neural networks compression method based on beta pruning and distillation |
CN108446724A (en) * | 2018-03-12 | 2018-08-24 | 江苏中天科技软件技术有限公司 | A kind of fusion feature sorting technique |
CN108459585A (en) * | 2018-04-09 | 2018-08-28 | 东南大学 | Power station fan method for diagnosing faults based on sparse locally embedding depth convolutional network |
CN108573287A (en) * | 2018-05-11 | 2018-09-25 | 浙江工业大学 | A kind of training method of the image codec based on deep neural network |
CN108596243A (en) * | 2018-04-20 | 2018-09-28 | 西安电子科技大学 | The eye movement for watching figure and condition random field attentively based on classification watches figure prediction technique attentively |
CN108629288A (en) * | 2018-04-09 | 2018-10-09 | 华中科技大学 | A kind of gesture identification model training method, gesture identification method and system |
CN108805258A (en) * | 2018-05-23 | 2018-11-13 | 北京图森未来科技有限公司 | A kind of neural network training method and its device, computer server |
CN108876774A (en) * | 2018-06-07 | 2018-11-23 | 浙江大学 | A kind of people counting method based on convolutional neural networks |
CN108960415A (en) * | 2017-05-23 | 2018-12-07 | 上海寒武纪信息科技有限公司 | Processing unit and processing system |
CN109034385A (en) * | 2017-06-12 | 2018-12-18 | 辉达公司 | With the system and method for sparse data training neural network |
CN109063835A (en) * | 2018-07-11 | 2018-12-21 | 中国科学技术大学 | The compression set and method of neural network |
CN109272118A (en) * | 2018-08-10 | 2019-01-25 | 北京达佳互联信息技术有限公司 | Data training method, device, equipment and storage medium |
CN109376615A (en) * | 2018-09-29 | 2019-02-22 | 苏州科达科技股份有限公司 | For promoting the method, apparatus and storage medium of deep learning neural network forecast performance |
CN109472274A (en) * | 2017-09-07 | 2019-03-15 | 富士通株式会社 | The training device and method of deep learning disaggregated model |
CN109492754A (en) * | 2018-11-06 | 2019-03-19 | 深圳市友杰智新科技有限公司 | One kind is based on deep neural network model compression and accelerated method |
CN109522949A (en) * | 2018-11-07 | 2019-03-26 | 北京交通大学 | Model of Target Recognition method for building up and device |
CN109615858A (en) * | 2018-12-21 | 2019-04-12 | 深圳信路通智能技术有限公司 | A kind of intelligent parking behavior judgment method based on deep learning |
CN109635288A (en) * | 2018-11-29 | 2019-04-16 | 东莞理工学院 | A kind of resume abstracting method based on deep neural network |
CN109685120A (en) * | 2018-12-11 | 2019-04-26 | 中科恒运股份有限公司 | Quick training method and terminal device of the disaggregated model under finite data |
CN109725531A (en) * | 2018-12-13 | 2019-05-07 | 中南大学 | A kind of successive learning method based on gate making mechanism |
CN109726045A (en) * | 2017-10-27 | 2019-05-07 | 百度(美国)有限责任公司 | System and method for the sparse recurrent neural network of block |
CN109815864A (en) * | 2019-01-11 | 2019-05-28 | 浙江工业大学 | A kind of facial image age recognition methods based on transfer learning |
WO2019100998A1 (en) * | 2017-11-24 | 2019-05-31 | 腾讯科技(深圳)有限公司 | Voice signal processing model training method, electronic device, and storage medium |
WO2019106619A1 (en) * | 2017-11-30 | 2019-06-06 | International Business Machines Corporation | Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression |
CN109960581A (en) * | 2017-12-26 | 2019-07-02 | 广东欧珀移动通信有限公司 | Hardware resource configuration method, device, mobile terminal and storage medium |
CN110008854A (en) * | 2019-03-18 | 2019-07-12 | 中交第二公路勘察设计研究院有限公司 | Unmanned plane image Highway Geological Disaster recognition methods based on pre-training DCNN |
CN110008880A (en) * | 2019-03-27 | 2019-07-12 | 深圳前海微众银行股份有限公司 | A kind of model compression method and device |
CN110059717A (en) * | 2019-03-13 | 2019-07-26 | 山东大学 | Convolutional neural networks automatic division method and system for breast molybdenum target data set |
CN110084365A (en) * | 2019-03-13 | 2019-08-02 | 西安电子科技大学 | A kind of service provider system and method based on deep learning |
CN110096976A (en) * | 2019-04-18 | 2019-08-06 | 中国人民解放军国防科技大学 | Human behavior micro-Doppler classification method based on sparse migration network |
CN110245587A (en) * | 2019-05-29 | 2019-09-17 | 西安交通大学 | A kind of remote sensing image object detection method based on Bayes's transfer learning |
CN110348422A (en) * | 2019-07-18 | 2019-10-18 | 北京地平线机器人技术研发有限公司 | Image processing method, device, computer readable storage medium and electronic equipment |
WO2019205604A1 (en) * | 2018-04-25 | 2019-10-31 | 北京市商汤科技开发有限公司 | Image processing method, training method, apparatus, device, medium and program |
WO2019205391A1 (en) * | 2018-04-26 | 2019-10-31 | 平安科技(深圳)有限公司 | Apparatus and method for generating vehicle damage classification model, and computer readable storage medium |
CN110580523A (en) * | 2018-06-07 | 2019-12-17 | 清华大学 | Error calibration method and device for analog neural network processor |
CN110647977A (en) * | 2019-08-26 | 2020-01-03 | 北京空间机电研究所 | Method for optimizing Tiny-YOLO network for detecting ship target on satellite |
CN110648531A (en) * | 2019-09-19 | 2020-01-03 | 军事科学院系统工程研究院网络信息研究所 | Node mobility prediction method based on deep learning in vehicle-mounted self-organizing network |
WO2020019102A1 (en) * | 2018-07-23 | 2020-01-30 | Intel Corporation | Methods, systems, articles of manufacture and apparatus to train a neural network |
CN110799996A (en) * | 2017-06-30 | 2020-02-14 | 康蒂-特米克微电子有限公司 | Knowledge transfer between different deep learning architectures |
CN110858253A (en) * | 2018-08-17 | 2020-03-03 | 第四范式(北京)技术有限公司 | Method and system for executing machine learning under data privacy protection |
CN110929839A (en) * | 2018-09-20 | 2020-03-27 | 深圳市商汤科技有限公司 | Method and apparatus for training neural network, electronic device, and computer storage medium |
CN111091177A (en) * | 2019-11-12 | 2020-05-01 | 腾讯科技(深圳)有限公司 | Model compression method and device, electronic equipment and storage medium |
CN111134662A (en) * | 2020-02-17 | 2020-05-12 | 武汉大学 | Electrocardio abnormal signal identification method and device based on transfer learning and confidence degree selection |
CN111291841A (en) * | 2020-05-13 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Image recognition model training method and device, computer equipment and storage medium |
CN111310520A (en) * | 2018-12-11 | 2020-06-19 | 阿里巴巴集团控股有限公司 | Dish identification method, cash registering method, dish order prompting method and related device |
TWI700647B (en) * | 2018-09-11 | 2020-08-01 | 國立清華大學 | Electronic apparatus and compression method for artificial neural network |
CN109407654B (en) * | 2018-12-20 | 2020-08-04 | 浙江大学 | Industrial data nonlinear causal analysis method based on sparse deep neural network |
CN111767996A (en) * | 2018-02-27 | 2020-10-13 | 上海寒武纪信息科技有限公司 | Integrated circuit chip device and related product |
CN111931698A (en) * | 2020-09-08 | 2020-11-13 | 平安国际智慧城市科技股份有限公司 | Image deep learning network construction method and device based on small training set |
CN112001477A (en) * | 2020-06-19 | 2020-11-27 | 南京理工大学 | Deep learning-based model optimization algorithm for target detection YOLOv3 |
CN112329931A (en) * | 2021-01-04 | 2021-02-05 | 北京智源人工智能研究院 | Countermeasure sample generation method and device based on proxy model |
CN112819157A (en) * | 2021-01-29 | 2021-05-18 | 商汤集团有限公司 | Neural network training method and device and intelligent driving control method and device |
CN113222976A (en) * | 2021-05-31 | 2021-08-06 | 河海大学 | Space-time image texture direction detection method and system based on DCNN and transfer learning |
CN107832837B (en) * | 2017-11-28 | 2021-09-28 | 南京大学 | Convolutional neural network compression method and decompression method based on compressed sensing principle |
CN113780535A (en) * | 2021-09-27 | 2021-12-10 | 华中科技大学 | Model training method and system applied to edge equipment |
CN113837376A (en) * | 2021-08-30 | 2021-12-24 | 厦门大学 | Neural network pruning method based on dynamic coding convolution kernel fusion |
US11244226B2 (en) | 2017-06-12 | 2022-02-08 | Nvidia Corporation | Systems and methods for training neural networks with sparse data |
US11347308B2 (en) | 2019-07-26 | 2022-05-31 | Samsung Electronics Co., Ltd. | Method and apparatus with gaze tracking |
WO2022116819A1 (en) * | 2020-12-04 | 2022-06-09 | 北京有竹居网络技术有限公司 | Model training method and apparatus, machine translation method and apparatus, and device and storage medium |
WO2022127907A1 (en) * | 2020-12-17 | 2022-06-23 | Moffett Technologies Co., Limited | System and method for domain specific neural network pruning |
-
2016
- 2016-08-26 CN CN201610738135.7A patent/CN106355248A/en active Pending
Cited By (110)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108230354B (en) * | 2017-05-18 | 2022-05-10 | 深圳市商汤科技有限公司 | Target tracking method, network training method, device, electronic equipment and storage medium |
CN108230354A (en) * | 2017-05-18 | 2018-06-29 | 深圳市商汤科技有限公司 | Target following, network training method, device, electronic equipment and storage medium |
CN108960415B (en) * | 2017-05-23 | 2021-04-20 | 上海寒武纪信息科技有限公司 | Processing apparatus and processing system |
CN108960415A (en) * | 2017-05-23 | 2018-12-07 | 上海寒武纪信息科技有限公司 | Processing unit and processing system |
WO2018223822A1 (en) * | 2017-06-07 | 2018-12-13 | 北京深鉴智能科技有限公司 | Pruning- and distillation-based convolutional neural network compression method |
CN108334934A (en) * | 2017-06-07 | 2018-07-27 | 北京深鉴智能科技有限公司 | Convolutional neural networks compression method based on beta pruning and distillation |
US11244226B2 (en) | 2017-06-12 | 2022-02-08 | Nvidia Corporation | Systems and methods for training neural networks with sparse data |
CN109034385A (en) * | 2017-06-12 | 2018-12-18 | 辉达公司 | With the system and method for sparse data training neural network |
CN107247989A (en) * | 2017-06-15 | 2017-10-13 | 北京图森未来科技有限公司 | A kind of neural network training method and device |
US11625594B2 (en) | 2017-06-15 | 2023-04-11 | Beijing Tusen Zhitu Technology Co., Ltd. | Method and device for student training networks with teacher networks |
CN107247989B (en) * | 2017-06-15 | 2020-11-24 | 北京图森智途科技有限公司 | Real-time computer vision processing method and device |
CN107102644B (en) * | 2017-06-22 | 2019-12-10 | 华南师范大学 | Underwater robot track control method and control system based on deep reinforcement learning |
CN107102644A (en) * | 2017-06-22 | 2017-08-29 | 华南师范大学 | The underwater robot method for controlling trajectory and control system learnt based on deeply |
CN107239802A (en) * | 2017-06-28 | 2017-10-10 | 广东工业大学 | A kind of image classification method and device |
CN107239802B (en) * | 2017-06-28 | 2021-06-01 | 广东工业大学 | Image classification method and device |
CN110799996A (en) * | 2017-06-30 | 2020-02-14 | 康蒂-特米克微电子有限公司 | Knowledge transfer between different deep learning architectures |
CN107392241A (en) * | 2017-07-17 | 2017-11-24 | 北京邮电大学 | A kind of image object sorting technique that sampling XGBoost is arranged based on weighting |
CN107480611A (en) * | 2017-07-31 | 2017-12-15 | 浙江大学 | A kind of crack identification method based on deep learning convolutional neural networks |
CN107480611B (en) * | 2017-07-31 | 2020-06-30 | 浙江大学 | Crack identification method based on deep learning convolutional neural network |
CN107491790A (en) * | 2017-08-25 | 2017-12-19 | 北京图森未来科技有限公司 | A kind of neural network training method and device |
CN109472274B (en) * | 2017-09-07 | 2022-06-28 | 富士通株式会社 | Training device and method for deep learning classification model |
CN109472274A (en) * | 2017-09-07 | 2019-03-15 | 富士通株式会社 | The training device and method of deep learning disaggregated model |
CN109726045A (en) * | 2017-10-27 | 2019-05-07 | 百度(美国)有限责任公司 | System and method for the sparse recurrent neural network of block |
US11651223B2 (en) | 2017-10-27 | 2023-05-16 | Baidu Usa Llc | Systems and methods for block-sparse recurrent neural networks |
CN109726045B (en) * | 2017-10-27 | 2023-07-25 | 百度(美国)有限责任公司 | System and method for block sparse recurrent neural network |
US11158304B2 (en) | 2017-11-24 | 2021-10-26 | Tencent Technology (Shenzhen) Company Limited | Training method of speech signal processing model with shared layer, electronic device and storage medium |
CN108108662A (en) * | 2017-11-24 | 2018-06-01 | 深圳市华尊科技股份有限公司 | Deep neural network identification model and recognition methods |
WO2019100998A1 (en) * | 2017-11-24 | 2019-05-31 | 腾讯科技(深圳)有限公司 | Voice signal processing model training method, electronic device, and storage medium |
CN107832837B (en) * | 2017-11-28 | 2021-09-28 | 南京大学 | Convolutional neural network compression method and decompression method based on compressed sensing principle |
JP7300798B2 (en) | 2017-11-30 | 2023-06-30 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Systems, methods, computer programs, and computer readable storage media for compressing neural network data |
JP2021504837A (en) * | 2017-11-30 | 2021-02-15 | International Business Machines Corporation | Compression of fully connected/recurrent layers of deep networks through enforcing spatial locality on weight matrices and effecting frequency compression |
WO2019106619A1 (en) * | 2017-11-30 | 2019-06-06 | International Business Machines Corporation | Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression |
CN111357019B (en) * | 2017-11-30 | 2023-12-29 | 国际商业机器公司 | Compressing fully connected/recurrent layers of deep network(s) by enforcing spatial locality on weight matrices and effecting frequency compression |
CN111357019A (en) * | 2017-11-30 | 2020-06-30 | 国际商业机器公司 | Compressing fully connected/recurrent layers of deep network(s) by enforcing spatial locality on weight matrices and effecting frequency compression |
GB2582233A (en) * | 2017-11-30 | 2020-09-16 | Ibm | Compression of fully connected/recurrent layers of deep network(s) through enforcing spatial locality to weight matrices and effecting frequency compression |
CN109960581B (en) * | 2017-12-26 | 2021-06-01 | Oppo广东移动通信有限公司 | Hardware resource allocation method and device, mobile terminal and storage medium |
CN109960581A (en) * | 2017-12-26 | 2019-07-02 | 广东欧珀移动通信有限公司 | Hardware resource configuration method, device, mobile terminal and storage medium |
CN108229682A (en) * | 2018-02-07 | 2018-06-29 | 深圳市唯特视科技有限公司 | Image detection countermeasure method based on backpropagation attack |
CN111767996A (en) * | 2018-02-27 | 2020-10-13 | 上海寒武纪信息科技有限公司 | Integrated circuit chip device and related product |
CN111767996B (en) * | 2018-02-27 | 2024-03-05 | 上海寒武纪信息科技有限公司 | Integrated circuit chip device and related products |
CN108446724B (en) * | 2018-03-12 | 2020-06-16 | 江苏中天科技软件技术有限公司 | Fusion feature classification method |
CN108446724A (en) * | 2018-03-12 | 2018-08-24 | 江苏中天科技软件技术有限公司 | Fusion feature classification method |
CN108629288A (en) * | 2018-04-09 | 2018-10-09 | 华中科技大学 | Gesture recognition model training method, gesture recognition method and system |
CN108629288B (en) * | 2018-04-09 | 2020-05-19 | 华中科技大学 | Gesture recognition model training method, gesture recognition method and system |
CN108459585A (en) * | 2018-04-09 | 2018-08-28 | 东南大学 | Power station fan fault diagnosis method based on sparse local embedding deep convolutional network |
CN108596243B (en) * | 2018-04-20 | 2021-09-10 | 西安电子科技大学 | Eye movement gaze prediction method based on hierarchical gaze view and conditional random field |
CN108596243A (en) * | 2018-04-20 | 2018-09-28 | 西安电子科技大学 | Eye movement gaze prediction method based on hierarchical gaze view and conditional random field |
WO2019205604A1 (en) * | 2018-04-25 | 2019-10-31 | 北京市商汤科技开发有限公司 | Image processing method, training method, apparatus, device, medium and program |
US11334763B2 (en) | 2018-04-25 | 2022-05-17 | Beijing Sensetime Technology Development Co., Ltd. | Image processing methods, training methods, apparatuses, devices, media, and programs |
WO2019205391A1 (en) * | 2018-04-26 | 2019-10-31 | 平安科技(深圳)有限公司 | Apparatus and method for generating vehicle damage classification model, and computer readable storage medium |
CN108573287A (en) * | 2018-05-11 | 2018-09-25 | 浙江工业大学 | Training method for an image codec based on deep neural networks |
CN108573287B (en) * | 2018-05-11 | 2021-10-29 | 浙江工业大学 | Deep neural network-based image codec training method |
CN108805258A (en) * | 2018-05-23 | 2018-11-13 | 北京图森未来科技有限公司 | Neural network training method and device, and computer server |
CN108876774A (en) * | 2018-06-07 | 2018-11-23 | 浙江大学 | People counting method based on convolutional neural networks |
CN110580523A (en) * | 2018-06-07 | 2019-12-17 | 清华大学 | Error calibration method and device for analog neural network processor |
CN109063835B (en) * | 2018-07-11 | 2021-07-09 | 中国科学技术大学 | Neural network compression device and method |
CN109063835A (en) * | 2018-07-11 | 2018-12-21 | 中国科学技术大学 | Neural network compression device and method |
WO2020019102A1 (en) * | 2018-07-23 | 2020-01-30 | Intel Corporation | Methods, systems, articles of manufacture and apparatus to train a neural network |
CN109272118A (en) * | 2018-08-10 | 2019-01-25 | 北京达佳互联信息技术有限公司 | Data training method, device, equipment and storage medium |
CN110858253A (en) * | 2018-08-17 | 2020-03-03 | 第四范式(北京)技术有限公司 | Method and system for executing machine learning under data privacy protection |
TWI700647B (en) * | 2018-09-11 | 2020-08-01 | 國立清華大學 | Electronic apparatus and compression method for artificial neural network |
US11270207B2 (en) | 2018-09-11 | 2022-03-08 | National Tsing Hua University | Electronic apparatus and compression method for artificial neural network |
CN110929839A (en) * | 2018-09-20 | 2020-03-27 | 深圳市商汤科技有限公司 | Method and apparatus for training neural network, electronic device, and computer storage medium |
CN109376615A (en) * | 2018-09-29 | 2019-02-22 | 苏州科达科技股份有限公司 | Method, apparatus and storage medium for improving prediction performance of deep learning network |
CN109376615B (en) * | 2018-09-29 | 2020-12-18 | 苏州科达科技股份有限公司 | Method, device and storage medium for improving prediction performance of deep learning network |
CN109492754A (en) * | 2018-11-06 | 2019-03-19 | 深圳市友杰智新科技有限公司 | Deep neural network model compression and acceleration method |
CN109522949A (en) * | 2018-11-07 | 2019-03-26 | 北京交通大学 | Target recognition model establishing method and device |
CN109522949B (en) * | 2018-11-07 | 2021-01-26 | 北京交通大学 | Target recognition model establishing method and device |
CN109635288B (en) * | 2018-11-29 | 2023-05-23 | 东莞理工学院 | Resume extraction method based on deep neural network |
CN109635288A (en) * | 2018-11-29 | 2019-04-16 | 东莞理工学院 | Resume extraction method based on deep neural network |
CN111310520A (en) * | 2018-12-11 | 2020-06-19 | 阿里巴巴集团控股有限公司 | Dish identification method, checkout method, dish order prompting method and related devices |
CN109685120A (en) * | 2018-12-11 | 2019-04-26 | 中科恒运股份有限公司 | Rapid training method and terminal device for classification models under limited data |
CN111310520B (en) * | 2018-12-11 | 2023-11-21 | 阿里巴巴集团控股有限公司 | Dish identification method, checkout method, dish ordering method and related devices |
CN109725531B (en) * | 2018-12-13 | 2021-09-21 | 中南大学 | Continual learning method based on gating mechanism |
CN109725531A (en) * | 2018-12-13 | 2019-05-07 | 中南大学 | Continual learning method based on gating mechanism |
CN109407654B (en) * | 2018-12-20 | 2020-08-04 | 浙江大学 | Industrial data nonlinear causal analysis method based on sparse deep neural network |
CN109615858A (en) * | 2018-12-21 | 2019-04-12 | 深圳信路通智能技术有限公司 | Intelligent parking behavior judgment method based on deep learning |
CN109815864A (en) * | 2019-01-11 | 2019-05-28 | 浙江工业大学 | Facial image age recognition method based on transfer learning |
CN109815864B (en) * | 2019-01-11 | 2021-01-01 | 浙江工业大学 | Facial image age identification method based on transfer learning |
CN110084365A (en) * | 2019-03-13 | 2019-08-02 | 西安电子科技大学 | Service providing system and method based on deep learning |
CN110084365B (en) * | 2019-03-13 | 2023-08-11 | 西安电子科技大学 | Service providing system and method based on deep learning |
CN110059717A (en) * | 2019-03-13 | 2019-07-26 | 山东大学 | Automatic convolutional neural network segmentation method and system for breast mammography (molybdenum target) data sets |
CN110008854A (en) * | 2019-03-18 | 2019-07-12 | 中交第二公路勘察设计研究院有限公司 | Unmanned aerial vehicle image highway geological disaster identification method based on pre-trained DCNN |
CN110008854B (en) * | 2019-03-18 | 2021-04-30 | 中交第二公路勘察设计研究院有限公司 | Unmanned aerial vehicle image highway geological disaster identification method based on pre-training DCNN |
CN110008880A (en) * | 2019-03-27 | 2019-07-12 | 深圳前海微众银行股份有限公司 | Model compression method and device |
CN110008880B (en) * | 2019-03-27 | 2023-09-29 | 深圳前海微众银行股份有限公司 | Model compression method and device |
CN110096976A (en) * | 2019-04-18 | 2019-08-06 | 中国人民解放军国防科技大学 | Human behavior micro-Doppler classification method based on sparse transfer network |
CN110245587A (en) * | 2019-05-29 | 2019-09-17 | 西安交通大学 | Remote sensing image object detection method based on Bayesian transfer learning |
CN110348422B (en) * | 2019-07-18 | 2021-11-09 | 北京地平线机器人技术研发有限公司 | Image processing method, image processing device, computer-readable storage medium and electronic equipment |
CN110348422A (en) * | 2019-07-18 | 2019-10-18 | 北京地平线机器人技术研发有限公司 | Image processing method, device, computer readable storage medium and electronic equipment |
US11347308B2 (en) | 2019-07-26 | 2022-05-31 | Samsung Electronics Co., Ltd. | Method and apparatus with gaze tracking |
CN110647977A (en) * | 2019-08-26 | 2020-01-03 | 北京空间机电研究所 | Method for optimizing a Tiny-YOLO network for on-satellite ship target detection |
CN110648531A (en) * | 2019-09-19 | 2020-01-03 | 军事科学院系统工程研究院网络信息研究所 | Node mobility prediction method based on deep learning in vehicle-mounted self-organizing network |
CN110648531B (en) * | 2019-09-19 | 2020-12-04 | 军事科学院系统工程研究院网络信息研究所 | Node mobility prediction method based on deep learning in vehicle-mounted self-organizing network |
CN111091177A (en) * | 2019-11-12 | 2020-05-01 | 腾讯科技(深圳)有限公司 | Model compression method and device, electronic equipment and storage medium |
CN111134662A (en) * | 2020-02-17 | 2020-05-12 | 武汉大学 | Abnormal ECG signal identification method and device based on transfer learning and confidence selection |
CN111291841A (en) * | 2020-05-13 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Image recognition model training method and device, computer equipment and storage medium |
CN112001477A (en) * | 2020-06-19 | 2020-11-27 | 南京理工大学 | Deep learning-based model optimization algorithm for the YOLOv3 target detection network |
CN111931698A (en) * | 2020-09-08 | 2020-11-13 | 平安国际智慧城市科技股份有限公司 | Image deep learning network construction method and device based on small training set |
WO2022116819A1 (en) * | 2020-12-04 | 2022-06-09 | 北京有竹居网络技术有限公司 | Model training method and apparatus, machine translation method and apparatus, and device and storage medium |
WO2022127907A1 (en) * | 2020-12-17 | 2022-06-23 | Moffett Technologies Co., Limited | System and method for domain specific neural network pruning |
CN116438544A (en) * | 2020-12-17 | 2023-07-14 | 墨芯国际有限公司 | System and method for domain-specific neural network pruning |
CN112329931A (en) * | 2021-01-04 | 2021-02-05 | 北京智源人工智能研究院 | Countermeasure sample generation method and device based on proxy model |
CN112329931B (en) * | 2021-01-04 | 2021-05-07 | 北京智源人工智能研究院 | Countermeasure sample generation method and device based on proxy model |
CN112819157A (en) * | 2021-01-29 | 2021-05-18 | 商汤集团有限公司 | Neural network training method and device and intelligent driving control method and device |
CN113222976B (en) * | 2021-05-31 | 2022-08-05 | 河海大学 | Space-time image texture direction detection method and system based on DCNN and transfer learning |
CN113222976A (en) * | 2021-05-31 | 2021-08-06 | 河海大学 | Space-time image texture direction detection method and system based on DCNN and transfer learning |
CN113837376B (en) * | 2021-08-30 | 2023-09-15 | 厦门大学 | Neural network pruning method based on dynamic coding convolution kernel fusion |
CN113837376A (en) * | 2021-08-30 | 2021-12-24 | 厦门大学 | Neural network pruning method based on dynamic coding convolution kernel fusion |
CN113780535A (en) * | 2021-09-27 | 2021-12-10 | 华中科技大学 | Model training method and system applied to edge equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106355248A (en) | Deep convolution neural network training method and device | |
CN109598269A (en) | Semantic segmentation method based on multi-resolution input and pyramid dilated convolution | |
CN108921294A (en) | Progressive block-wise knowledge distillation method for neural network acceleration | |
CN110222140A (en) | Cross-modal retrieval method based on adversarial learning and asymmetric hashing | |
CN109543502A (en) | Semantic segmentation method based on deep multi-scale neural network | |
CN110378208B (en) | Behavior recognition method based on deep residual network | |
CN105772407A (en) | Waste classification robot based on image recognition technology | |
CN106203363A (en) | Human skeleton motion sequence activity recognition method | |
CN111709321B (en) | Human behavior recognition method based on graph convolution neural network | |
CN110222717A (en) | Image processing method and device | |
CN107657204A (en) | Deep network model construction method, and facial expression recognition method and system | |
CN110222634A (en) | Human posture recognition method based on convolutional neural networks | |
CN107145893A (en) | Image recognition algorithm and system based on deep convolutional network | |
CN111709289B (en) | Multi-task deep learning model for improving human parsing performance | |
CN107203752A (en) | Face recognition method combining deep learning and feature L2-norm constraint | |
CN109284741A (en) | Large-scale remote sensing image retrieval method and system based on deep hashing network | |
CN109871892A (en) | Robot vision cognition system based on small-sample metric learning | |
CN110321997A (en) | Highly parallel computing platform, system and computation implementation method | |
CN108564166A (en) | Semi-supervised feature learning method based on convolutional neural networks with symmetric parallel connections | |
CN112651360B (en) | Skeleton action recognition method under small-sample conditions | |
CN109102475A (en) | Image rain removal method and device | |
CN113052254A (en) | Multi-attention ghost residual fusion classification model and classification method thereof | |
CN113128424A (en) | Attention mechanism-based graph convolutional neural network action recognition method | |
Zhang et al. | Skip-attention encoder–decoder framework for human motion prediction | |
CN111612046B (en) | Feature pyramid graph convolution neural network and application thereof in 3D point cloud classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | |
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170125 |