CN107480773A - Method, apparatus and storage medium for training a convolutional neural network model - Google Patents
- Publication number: CN107480773A (application CN201710675297.5A)
- Authority: CN (China)
- Prior art keywords: hidden layer, hiding, probability
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045 — Combinations of networks
- G06N3/047 — Probabilistic or stochastic networks
- G06N3/08 — Learning methods

All three classifications fall under G (Physics) › G06 (Computing; calculating or counting) › G06N (Computing arrangements based on specific computational models) › G06N3/00 (Computing arrangements based on biological models) › G06N3/02 (Neural networks), with G06N3/045 and G06N3/047 further under G06N3/04 (Architecture, e.g. interconnection topology).
Abstract
The present disclosure relates to a method, apparatus and storage medium for training a convolutional neural network model, in the field of deep learning. The method includes: for each of a plurality of hidden layers included in the convolutional neural network model, selecting target nodes from a plurality of nodes included in the hidden layer based on a hiding probability of the hidden layer, the hiding probabilities of the plurality of hidden layers being different from one another; and training the convolutional neural network model based on the target nodes selected from the plurality of hidden layers. Because different hidden layers correspond to different input values, selecting target nodes from different hidden layers with different hiding probabilities can effectively improve the image recognition accuracy of the convolutional neural network model, compared with the related art in which all hidden layers are trained with the same hiding probability.
Description
Technical field
The present disclosure relates to the field of deep learning, and in particular to a method, apparatus and storage medium for training a convolutional neural network model.
Background art
In recent years, deep learning has been widely applied to image recognition and classification. The convolutional neural network model used in deep learning is typically a multi-layer convolutional network. When such a model is trained on a training set with few samples, overfitting easily occurs, which reduces image recognition accuracy. To address this problem, the convolutional neural network model can be trained using the Dropout algorithm.
In the related art, a convolutional neural network model may include an input layer, an output layer, and a plurality of hidden layers located between the input layer and the output layer. The input layer is connected to the first hidden layer; the output value of each hidden layer serves as the input value of the next adjacent hidden layer, and the output value of the last hidden layer serves as the input value of the output layer. When the convolutional neural network model is trained using the Dropout algorithm, for each hidden layer in the model, target nodes can be selected from the plurality of nodes included in the hidden layer according to a predetermined probability, and the model is trained according to the target nodes selected from the plurality of hidden layers.
Summary of the invention
To overcome the problem in the related art that the image recognition accuracy of a convolutional neural network model is relatively low when the model is trained with the same predetermined probability for every hidden layer, the present disclosure provides a method, apparatus and storage medium for training a convolutional neural network model.
According to a first aspect of embodiments of the present disclosure, there is provided a method for training a convolutional neural network model, including: for each of a plurality of hidden layers included in the convolutional neural network model, selecting target nodes from a plurality of nodes included in the hidden layer based on a hiding probability of the hidden layer, the hiding probabilities of the plurality of hidden layers being different from one another; and training the convolutional neural network model based on the target nodes selected from the plurality of hidden layers.
Optionally, before the target nodes are selected from the plurality of nodes included in the hidden layer based on the hiding probability of the hidden layer, the method further includes: determining the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model, the hiding probabilities of the plurality of hidden layers increasing successively in order of the abstraction level of the hidden layers' output values from high to low.
Optionally, determining the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model includes: for each of the plurality of hidden layers included in the convolutional neural network model, obtaining the output value of the hidden layer; and determining the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, determining the hiding probability of the hidden layer based on the output value of the hidden layer includes: performing singular value decomposition on the output value of the hidden layer to obtain N singular values, N being a positive integer greater than 1; calculating the sum of squares of the N singular values, and calculating the product of that sum of squares and a preset ratio to obtain a target sum of squares; sorting the N singular values in descending order to obtain a sorting result; determining the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares and the sum of squares of the first M−1 singular values in the sorting result is less than the target sum of squares, M being a positive integer greater than or equal to 1; and determining the ratio of M to N as the hiding probability of the hidden layer.
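The SVD-based procedure above can be sketched in Python as follows. This is a minimal sketch, not the patent's implementation: the function and argument names and the default `preset_ratio` value are illustrative assumptions.

```python
import numpy as np

def hiding_probability(output_value: np.ndarray, preset_ratio: float = 0.9) -> float:
    """Estimate a hidden layer's hiding probability from the SVD of its output value."""
    # Singular value decomposition of the layer's output matrix; the values
    # returned by np.linalg.svd are already sorted in descending order.
    singular_values = np.linalg.svd(output_value, compute_uv=False)
    n = len(singular_values)
    # Target sum of squares = preset ratio x total sum of squares.
    target = preset_ratio * np.sum(singular_values ** 2)
    # Smallest M whose leading-M sum of squares reaches the target
    # (assumes 0 < preset_ratio < 1 so that such an M exists).
    cumulative = np.cumsum(singular_values ** 2)
    m = int(np.searchsorted(cumulative, target) + 1)
    return m / n
```

For example, a layer output with singular values 3, 2 and 1 has total sum of squares 14; with a preset ratio of 0.9 the target is 12.6, the first two squared values (9 + 4 = 13) reach it, so M = 2 and the hiding probability is 2/3.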
Optionally, determining the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model includes: obtaining the output value of a first hidden layer and the output value of a second hidden layer, the first hidden layer being the hidden layer whose output value has the lowest abstraction level among the plurality of hidden layers, and the second hidden layer being the hidden layer whose output value has the highest abstraction level among the plurality of hidden layers; determining the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determining the hiding probability of the second hidden layer based on the output value of the second hidden layer; and determining the hiding probabilities of the other hidden layers in the plurality of hidden layers, other than the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer. Determining the hiding probabilities of the other hidden layers based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer includes: determining the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the plurality of hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
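The claim above does not fix an exact formula; one natural reading — splitting the probability difference into equal steps over the layers in between — can be sketched as follows. The function name and the even-spacing assumption are illustrative, not taken from the patent.

```python
def interpolate_hiding_probabilities(p_first: float, p_last: float, num_layers: int) -> list:
    """Spread hiding probabilities evenly from the first hidden layer (p_first)
    to the last hidden layer (p_last) over num_layers hidden layers."""
    if num_layers < 2:
        return [p_first]
    # Divide the probability difference into equal steps across the layers.
    step = (p_last - p_first) / (num_layers - 1)
    return [p_first + i * step for i in range(num_layers)]
```

With a first-layer probability of 0.9, a last-layer probability of 0.5 and five hidden layers, this yields 0.9, 0.8, 0.7, 0.6, 0.5 — probabilities decreasing toward the more abstract layers, consistent with the principle stated elsewhere in the description.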
Optionally, selecting target nodes from the plurality of nodes included in the hidden layer based on the hiding probability of the hidden layer includes: for each node in the plurality of nodes included in the hidden layer, generating a random probability for the node according to a preset rule; and when the random probability is less than the hiding probability, determining the node as a target node.
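The per-node selection rule above can be sketched as follows; the "preset rule" is assumed here to be a uniform draw on [0, 1), and the names are illustrative.

```python
import random

def select_target_nodes(node_ids, hiding_prob, seed=None):
    """Generate a random probability per node and keep the nodes whose
    random probability falls below the layer's hiding probability."""
    rng = random.Random(seed)
    return [node for node in node_ids if rng.random() < hiding_prob]
```

A hiding probability of 1.0 selects every node and 0.0 selects none; intermediate values select roughly that fraction of the layer's nodes on each draw.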
According to a second aspect of embodiments of the present disclosure, there is provided an apparatus for training a convolutional neural network model, the apparatus including: a selecting module, configured to, for each of a plurality of hidden layers included in the convolutional neural network model, select target nodes from a plurality of nodes included in the hidden layer based on the hiding probability of the hidden layer, the hiding probabilities of the plurality of hidden layers being different from one another; and a training module, configured to train the convolutional neural network model based on the target nodes selected from the plurality of hidden layers.
Optionally, the apparatus further includes: a determining module, configured to determine the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model, the hiding probabilities of the plurality of hidden layers increasing successively in order of the abstraction level of the hidden layers' output values from high to low.
Optionally, the determining module includes: a first acquisition submodule, configured to obtain, for each of the plurality of hidden layers included in the convolutional neural network model, the output value of the hidden layer; and a first determination submodule, configured to determine the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, the first determination submodule is configured to: perform singular value decomposition on the output value of the hidden layer to obtain N singular values, N being a positive integer greater than 1; calculate the sum of squares of the N singular values, and calculate the product of that sum of squares and a preset ratio to obtain a target sum of squares; sort the N singular values in descending order to obtain a sorting result; determine the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares and the sum of squares of the first M−1 singular values in the sorting result is less than the target sum of squares, M being a positive integer greater than or equal to 1; and determine the ratio of M to N as the hiding probability of the hidden layer.
Optionally, the determining module includes: a second acquisition submodule, configured to obtain the output value of a first hidden layer and the output value of a second hidden layer, the first hidden layer being the hidden layer whose output value has the lowest abstraction level among the plurality of hidden layers, and the second hidden layer being the hidden layer whose output value has the highest abstraction level among the plurality of hidden layers; a second determination submodule, configured to determine the hiding probability of the first hidden layer based on the output value of the first hidden layer, and to determine the hiding probability of the second hidden layer based on the output value of the second hidden layer; and a third determination submodule, configured to determine the hiding probabilities of the other hidden layers in the plurality of hidden layers, other than the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer. The third determination submodule is configured to: determine the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the plurality of hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
Optionally, the selecting module includes: a fourth determination submodule, configured to generate, for each node in the plurality of nodes included in the hidden layer, a random probability for the node according to a preset rule; and a fifth determination submodule, configured to determine the node as a target node when the random probability is less than the hiding probability.
According to a third aspect of embodiments of the present disclosure, there is provided an apparatus for training a convolutional neural network model, the apparatus including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the steps of any of the methods described in the first aspect above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions which, when executed by a processor, implement the steps of any of the methods described in the first aspect above.
The technical solutions provided by the embodiments of the present disclosure can include the following beneficial effects: for each of a plurality of hidden layers included in a convolutional neural network model, target nodes are selected from the plurality of nodes included in the hidden layer according to the hiding probability of the hidden layer, and the convolutional neural network model is trained according to the target nodes selected from the plurality of hidden layers, the hiding probabilities of the plurality of hidden layers being different from one another. Because different hidden layers correspond to different input values, selecting target nodes from different hidden layers with different hiding probabilities can effectively improve the image recognition accuracy of the convolutional neural network model, compared with the related art in which all hidden layers are trained with the same hiding probability.
It should be understood that the general description above and the detailed description below are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is an architecture diagram of a convolutional neural network model according to an exemplary embodiment.
Fig. 2 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment.
Fig. 3 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment.
Fig. 4 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment.
Fig. 5A is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Fig. 5B is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Fig. 5C is a block diagram of a determining module according to an exemplary embodiment.
Fig. 5D is a block diagram of another determining module according to an exemplary embodiment.
Fig. 6 is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Fig. 7 is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present disclosure clearer, embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present disclosure are explained in detail, the application scenarios involved in the embodiments of the present disclosure are first introduced.
A convolutional neural network model is a neural network for image classification and recognition that developed from the traditional multi-layer neural network; compared with the traditional multi-layer neural network, a convolutional neural network model introduces convolution and pooling algorithms. Before a convolutional neural network model can be used to classify and recognize images, it must be trained. During training, multiple samples from the training set can be input into the convolutional neural network model and forward computation performed, so as to obtain the output value of each layer of the model. Then, backward computation can be performed according to the output value of the last layer of the model and the output values of the remaining layers, so as to update the parameters of the nodes of each layer.
During training, if the training set contains too few samples, the trained convolutional neural network model may fit the training samples very accurately, yet exhibit very poor precision when fitted to test data; this phenomenon is overfitting. If the convolutional neural network model overfits, the image recognition accuracy achieved with the model is also greatly reduced.
Currently, in order to prevent the above-mentioned overfitting, Dropout techniques can be used to train the convolutional neural network model. Dropout was first proposed by Hinton, the father of deep learning. When a convolutional neural network model is trained with Dropout, for each of the plurality of hidden layers included in the model, multiple target nodes in the hidden layer are selected by a predetermined probability during one training iteration, and the parameters of the selected target nodes are updated. The remaining unselected nodes of the hidden layer are treated as hidden during the current training iteration; that is, the parameters of the unselected nodes in the hidden layer are temporarily not updated. When training is iterated again with other training samples, target nodes are again selected by the predetermined probability. Because the nodes selected in each training iteration differ, the convolutional neural network model obtained from each training iteration also differs. Since training with Dropout accepts or rejects the nodes included in each hidden layer, it reduces the coupling between the nodes within each hidden layer, alleviates overfitting, and thereby improves image recognition accuracy. The method for training a convolutional neural network model provided by the embodiments of the present disclosure can be used in the above-described process of training a convolutional neural network model with Dropout.
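The per-iteration selection described above can be sketched with NumPy as follows. This is an illustrative sketch: the function and variable names are assumptions, and the key point it demonstrates is that a fresh selection mask is drawn on every iteration, so each iteration updates a different sub-network.

```python
import numpy as np

def iteration_mask(num_nodes: int, hiding_prob: float, rng: np.random.Generator) -> np.ndarray:
    """One iteration's node selection for a hidden layer: True marks a selected
    node whose parameters are updated; False marks a node treated as hidden."""
    return rng.random(num_nodes) < hiding_prob

rng = np.random.default_rng(0)
# A fresh mask is drawn per training iteration; unselected nodes keep
# their old parameters until a later iteration selects them again.
masks = [iteration_mask(8, 0.75, rng) for _ in range(3)]
```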
After introducing the application scenarios of the embodiments of the present disclosure, the basic architecture of the convolutional neural network model involved in the embodiments of the present disclosure is introduced next.
Fig. 1 shows the architecture of a convolutional neural network model provided by an embodiment of the present disclosure. As shown in Fig. 1, the convolutional neural network model includes an input layer 101, hidden layers 102-105 and an output layer 106, where the input layer 101 is connected to the hidden layer 102, the hidden layers 102, 103, 104 and 105 are connected in sequence, and the hidden layer 105 is connected to the output layer 106. The input layer may include only one node or multiple nodes; each of the hidden layers 102-105 may include multiple nodes; and the output layer 106 may include one node or multiple nodes. As shown in Fig. 1, in the embodiments of the present disclosure, it is assumed that the input layer includes only one node, each of the hidden layers 102-105 includes 4 nodes, and the output layer 106 includes one node.
The input layer determines the pixel values of all pixels included in the input image and transmits the pixel values of all pixels of the image to the hidden layer 102, which can be a convolutional layer. After receiving the pixel values of all pixels, the hidden layer 102 performs a first convolution operation on the received pixel values, obtains the pixels after the first convolution processing, and transmits them to the hidden layer 103. At this point, the hidden layer 103 can be a convolutional layer, a sampling layer or a pooling layer. When the hidden layer 103 is a pooling layer, it can perform a first pooling operation on the pixel values after the first convolution processing, obtain the pixels after the first pooling processing, and transmit them to the next hidden layer 104. The hidden layer 104 can be a convolutional layer; it can perform a second convolution operation on the pixels after the first pooling processing, obtain the pixels after the second convolution processing, and transmit them to the next hidden layer 105. When the hidden layer 104 is a convolutional layer, the hidden layer 105 can be a pooling layer: it can perform a second pooling operation on the received pixels after the second convolution processing, obtain the pixels after the second pooling processing, and transmit them to the output layer 106. Typically, the output layer 106 is a fully connected layer; it can determine, from the pixel values of the received processed pixels, the probability that the image belongs to each of a plurality of preset categories, thereby obtaining the category of the image.
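The convolution-pooling pipeline of Fig. 1 can be sketched end to end as follows. The input size, kernels and weights are illustrative assumptions; the sketch only mirrors the layer sequence (convolution → pooling → convolution → pooling → fully connected) described above.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as commonly used in CNN layers)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(image, size=2):
    """Non-overlapping max pooling; trailing rows/columns that do not fill a window are dropped."""
    h, w = image.shape
    return image[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.random((12, 12))              # input image (input layer 101)
k1, k2 = rng.random((3, 3)), rng.random((3, 3))
h1 = conv2d(x, k1)                    # hidden layer 102: first convolution  -> 10x10
h2 = max_pool(h1)                     # hidden layer 103: first pooling      -> 5x5
h3 = conv2d(h2, k2)                   # hidden layer 104: second convolution -> 3x3
h4 = max_pool(h3)                     # hidden layer 105: second pooling     -> 1x1
w_out = rng.random(h4.size)
score = float(w_out @ h4.ravel())     # output layer 106: fully connected score
```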
It should be noted that a hidden layer in a convolutional neural network model can be a convolutional layer, a sampling layer, a pooling layer or a fully connected layer. The architecture of the convolutional neural network model provided above is only one possible architecture provided by the embodiments of the present disclosure and does not limit the embodiments of the present disclosure.
After introducing the application scenarios of the embodiments of the present disclosure and the architecture of the convolutional neural network model involved, the implementation of training a convolutional neural network model provided by the embodiments of the present disclosure is explained in detail next.
Fig. 2 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment. As shown in Fig. 2, the method for training a convolutional neural network model can be used in a terminal or in a server, and includes the following steps:
In step 201, for each of a plurality of hidden layers included in the convolutional neural network model, target nodes are selected from a plurality of nodes included in the hidden layer based on the hiding probability of the hidden layer, the hiding probabilities of the plurality of hidden layers being different from one another.
In step 202, the convolutional neural network model is trained based on the target nodes selected from the plurality of hidden layers.
In the embodiments of the present disclosure, for each of the plurality of hidden layers included in the convolutional neural network model, target nodes are selected from the plurality of nodes included in the hidden layer according to the hiding probability of the hidden layer, and the convolutional neural network model is trained according to the target nodes selected from the plurality of hidden layers, the hiding probabilities of the plurality of hidden layers being different from one another. Because different hidden layers correspond to different input values, selecting target nodes from different hidden layers with different hiding probabilities can effectively improve the image recognition accuracy of the convolutional neural network model, compared with the related art in which all hidden layers are trained with the same hiding probability.
Optionally, before the target nodes are selected from the plurality of nodes included in the hidden layer based on the hiding probability of the hidden layer, the method further includes: determining the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model, the hiding probabilities of the plurality of hidden layers increasing successively in order of the abstraction level of the hidden layers' output values from high to low.
Optionally, determining the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model includes: for each of the plurality of hidden layers included in the convolutional neural network model, obtaining the output value of the hidden layer; and determining the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, determining the hiding probability of the hidden layer based on the output value of the hidden layer includes: performing singular value decomposition on the output value of the hidden layer to obtain N singular values, N being a positive integer greater than 1; calculating the sum of squares of the N singular values, and calculating the product of that sum of squares and a preset ratio to obtain a target sum of squares; sorting the N singular values in descending order to obtain a sorting result; determining the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares and the sum of squares of the first M−1 singular values in the sorting result is less than the target sum of squares, M being a positive integer greater than or equal to 1; and determining the ratio of M to N as the hiding probability of the hidden layer.
Optionally, determining the hiding probability of each of the plurality of hidden layers included in the convolutional neural network model includes: obtaining the output value of a first hidden layer and the output value of a second hidden layer, the first hidden layer being the hidden layer whose output value has the lowest abstraction level among the plurality of hidden layers, and the second hidden layer being the hidden layer whose output value has the highest abstraction level among the plurality of hidden layers; determining the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determining the hiding probability of the second hidden layer based on the output value of the second hidden layer; and determining the hiding probabilities of the other hidden layers in the plurality of hidden layers, other than the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer. Determining the hiding probabilities of the other hidden layers based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer includes: determining the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the plurality of hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
Optionally, selecting target nodes from the plurality of nodes included in the hidden layer based on the hiding probability of the hidden layer includes: for each node in the plurality of nodes included in the hidden layer, generating a random probability for the node according to a preset rule; and when the random probability is less than the hiding probability, determining the node as a target node.
All of the above optional technical solutions can be combined in any manner to form optional embodiments of the present disclosure, which the embodiments of the present disclosure do not describe one by one here.
When training convolutional neural networks model, multiple hidden layers for including for the convolutional neural networks model can be with
Selection target node in the multiple nodes included using different hiding probability from each hidden layer, and according to the target section of selection
Point is trained to the convolutional neural networks model.And before selection target node, it may be determined that each in multiple hidden layers
The hiding probability of hidden layer.
It should be noted that, in a convolutional neural network model, the level of abstraction of each hidden layer's output value is different. Typically, in the order from the input layer to the output layer, the closer a hidden layer is to the output layer, the closer its output value is to classification information, that is, the higher its level of abstraction; the closer a hidden layer is to the input layer, the closer its output value is to shape information, and the lower its level of abstraction. A hidden layer with a high level of abstraction can therefore be trained with a smaller hiding probability, and a hidden layer with a lower level of abstraction with a larger hiding probability.
Based on the foregoing, in the embodiments of the present disclosure, the hiding probability of each of the multiple hidden layers may be determined according to the level of abstraction of each layer's output value, following the principle that a hidden layer's hiding probability decreases as its level of abstraction rises. Specifically, the hiding probability of each hidden layer may be determined by two different methods, and the convolutional neural network model is then trained according to these hiding probabilities.
Next, the first method of training a convolutional neural network model provided by the embodiments of the present disclosure is explained in detail with reference to Fig. 3.
Fig. 3 is a flowchart of a method of training a convolutional neural network model according to an exemplary embodiment. The method can be used in a terminal or a server; in the embodiments of the present disclosure, it is explained with a terminal as the executing body. When the executing body is a server, the convolutional neural network model can still be trained through the implementation process in the following embodiments. As shown in Fig. 3, the method comprises the following steps:
In step 301, for each hidden layer among the multiple hidden layers included in the convolutional neural network model, the output value of the hidden layer is obtained.
In the embodiments of the present disclosure, when the convolutional neural network model is trained, a forward pass may be performed on a training image from a training sample set. That is, the training image is input through the input layer of the model, passes through the computations of the multiple intermediate hidden layers, and the recognition result for the training image is finally output by the output layer of the model.
During the forward pass of the training image, in the order from the input layer to the output layer, the output value of the input layer serves as the input value of the first hidden layer, and the output value of the last hidden layer serves as the input value of the output layer. For any two adjacent hidden layers, the output value of the former hidden layer serves as the input value of the latter. As the training image is propagated from the input layer to the output layer, the terminal can obtain the output value of each of the multiple hidden layers of the convolutional neural network model.
In step 302, based on the output value of each hidden layer, the hiding probability of each hidden layer is determined. The hiding probabilities of the multiple hidden layers differ from one another and rise successively in the order of the levels of abstraction of the layers' output values from high to low.
After obtaining the output value of each hidden layer, the terminal can determine the hiding probability of each hidden layer from its output value by the following method. The hiding probability then serves as the basis for subsequently selecting among the multiple nodes included in the hidden layer.
For each hidden layer among the multiple hidden layers, the operation of determining the hiding probability of the hidden layer based on its output value may be: performing singular value decomposition on the output value of the hidden layer to obtain N singular values, N being a positive integer greater than 1; calculating the sum of squares of the N singular values, and calculating the product of that sum of squares and a preset ratio to obtain a target sum of squares; sorting the N singular values in descending order to obtain a sorting result; determining the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values is less than the target sum of squares, and M is a positive integer greater than or equal to 1; and determining the ratio of M to N as the hiding probability of the hidden layer.
It should be noted that, after the output value of the hidden layer is obtained, singular value decomposition may be performed on it to obtain the N singular values. The sum of squares of the N singular values can then be calculated, and the target sum of squares calculated according to the preset ratio. For example, if the sum of squares of the N singular values is R and the preset ratio is k, the target sum of squares is W = k*R. The preset ratio may be 80%, 70%, or another value.
After the target sum of squares is determined, the terminal can sort the obtained N singular values in descending order to obtain a sorting result. Starting from the first singular value in the sorting result, the terminal can calculate the square of that singular value and judge whether it is greater than or equal to the target sum of squares. If the square of the first singular value is smaller than the target sum of squares, the terminal can add the square of the second singular value to it, obtain the sum of squares of the first two singular values in the sorting result, and judge whether this sum is greater than or equal to the target sum of squares. Following this method, the terminal keeps calculating and judging until the sum of squares of the first M singular values in the sorting result is greater than or equal to the target sum of squares, at which point it stops and takes the ratio of M to N as the hiding probability of the hidden layer.
For example, suppose the singular value decomposition of a hidden layer's output value yields 10 singular values, that is, N = 10. The sum of squares of the 10 singular values is R, and the preset ratio is 80%, so the target sum of squares is W = 80%*R. Sorting the 10 singular values yields the sorting result (n1, n2, n3, n4, n5, n6, n7, n8, n9, n10). The terminal first calculates n1² and judges whether n1² ≥ W. If n1² ≥ W, then M is 1, and the hiding probability of the hidden layer is M/N = 0.1. If n1² < W, the terminal calculates n1² + n2² and judges whether it is greater than or equal to W. If n1² + n2² < W, the terminal continues with n1² + n2² + n3², and so on, stopping as soon as the calculated sum of squares is greater than or equal to W. Suppose that when the terminal calculates n1² + n2² + n3² + n4² + n5² + n6², it determines that this sum is greater than or equal to the target sum of squares; then M = 6, and the hiding probability of the hidden layer is M/N = 0.6.
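As a sketch, the cumulative-sum procedure above can be expressed with NumPy's SVD routine; the function name `svd_hide_probability`, the use of `np.searchsorted`, and the `ratio` parameter default are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def svd_hide_probability(output, ratio=0.8):
    """Sketch of the SVD-based hiding-probability rule described above.

    `output` is a hidden layer's output value (as a matrix); `ratio` is
    the preset ratio k (80% in the example). Returns M / N, where M is
    the smallest number of leading singular values whose squares sum to
    at least k times the total sum of squared singular values.
    """
    s = np.linalg.svd(output, compute_uv=False)  # singular values, descending
    squares = s ** 2
    target = ratio * squares.sum()               # target sum of squares W = k*R
    cumulative = np.cumsum(squares)
    M = int(np.searchsorted(cumulative, target) + 1)  # first M with cumsum >= W
    return M / len(s)
```

For a diagonal matrix with singular values (3, 2, 1) and k = 0.8, the target is 0.8 * 14 = 11.2; the cumulative sums are 9, 13, 14, so M = 2 and the hiding probability is 2/3.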
In step 303, a target node is selected, based on the hiding probability of the hidden layer, from the multiple nodes included in the hidden layer.
After the hiding probability of the hidden layer is determined, for each node among the multiple nodes included in the hidden layer, the terminal can generate a random probability for the node according to a preset rule; when the random probability is less than the hiding probability, the node is determined as a target node.
The terminal may generate the random probability for the node through a Bernoulli function or a binomial distribution function, and then compare the random probability with the hiding probability of the hidden layer, so as to judge whether the parameters of the node are updated in the subsequent training process.
For example, suppose the hidden layer includes 4 nodes and the determined hiding probability of the hidden layer is 0.6. For node 1, suppose the random probability determined through the Bernoulli function or the binomial distribution function is 0.4; since 0.4 is less than 0.6, node 1 can be determined as a target node, that is, its parameters need to be updated in the subsequent training process. For node 2, suppose its random probability is 0.7; since 0.7 is greater than 0.6, node 2 will not be taken as a target node, that is, node 2 will be temporarily hidden during the subsequent training process and its parameters will not be updated. For the remaining two nodes, whether they are target nodes can be determined by the above method.
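The per-node selection rule above can be sketched as a vectorized draw; the function and parameter names are illustrative assumptions, and a uniform draw is used here as one concrete choice of "preset rule" consistent with the comparison described in the text.

```python
import numpy as np

def select_target_nodes(num_nodes, hide_prob, rng=None):
    """Sketch of step 303: draw one random probability per node and
    select as a target node every node whose random probability is
    less than the layer's hiding probability.
    """
    rng = np.random.default_rng(rng)
    random_probs = rng.random(num_nodes)  # one random value in [0, 1) per node
    return random_probs < hide_prob       # boolean mask: True = target node
```

With a hiding probability of 0.6, each node is selected as a target node with probability 0.6, so roughly 60% of a layer's nodes are updated in a given training pass while the rest are temporarily hidden.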
For each hidden layer, the terminal can select at least one target node from it by the above method. After the target nodes corresponding to each hidden layer have been determined, the terminal can train the convolutional neural network model by the method in step 304.
In step 304, the convolutional neural network model is trained based on the target nodes selected from the multiple hidden layers.
For each hidden layer, the terminal can select the target nodes from it by the above method. After the terminal has determined the target nodes corresponding to each of the multiple hidden layers, it can train the convolutional neural network model through the selected target nodes.
When the terminal has determined the target nodes of the multiple hidden layers and the forward pass of the training process has been completed, the terminal can start the backward pass. During the backward pass, the parameters of the selected target nodes can be updated according to the output value of the output layer during the forward pass, thereby completing the training of the convolutional neural network model.
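How the per-layer hiding probabilities act during the forward pass might be sketched as follows in plain NumPy: each layer zeroes out its non-target nodes, so only target nodes carry signal into (and receive updates from) the subsequent backward pass. The ReLU activation, the layer shapes, and all names are illustrative assumptions, not details from the patent.

```python
import numpy as np

def forward_with_masks(x, weights, hide_probs, rng=None):
    """Illustrative forward pass with a different hiding probability per
    hidden layer: non-target nodes are temporarily hidden (zeroed), so
    only target nodes contribute to the output and backward pass.
    """
    rng = np.random.default_rng(rng)
    h = x
    masks = []
    for W, p in zip(weights, hide_probs):
        h = np.maximum(h @ W, 0.0)         # hidden layer output (assumed ReLU)
        mask = rng.random(h.shape[1]) < p  # target nodes for this layer
        h = h * mask                       # hide the non-target nodes
        masks.append(mask)
    return h, masks
```

The returned masks record which nodes were targets in each layer, which is the information a backward pass would need in order to update only the target nodes' parameters.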
In the embodiments of the present disclosure, the terminal can determine the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers differ from one another and decrease successively in the order of the levels of abstraction of the layers' output values from low to high. Because, in the order from the input layer to the output layer, the level of abstraction of each hidden layer's output value is different, the terminal can determine a different hiding probability for each hidden layer according to its level of abstraction. And because different hidden layers correspond to different input values, training the convolutional neural network model with different hiding probabilities can effectively improve the image recognition accuracy of the model, compared with the related art in which all hidden layers are trained with the same hiding probability.
It should be noted that experiments have confirmed that, when the hiding probabilities are determined according to the method provided in the embodiments of the present disclosure and the convolutional neural network model is trained according to these hiding probabilities, performing image recognition with the trained model raises the recognition accuracy from 0.86 to 0.92, compared with image recognition using a model trained with the preset probability of 0.5 in the related art; the improvement is clear.
The first method of training a convolutional neural network model has been described through the above embodiment. Next, the second method of training a convolutional neural network model provided by the embodiments of the present disclosure is introduced with reference to Fig. 4.
Fig. 4 is a flowchart of a method of training a convolutional neural network model according to an exemplary embodiment. The method can be used in a terminal or a server; in the embodiments of the present disclosure, it is explained with a terminal as the executing body. In practical applications, when the executing body is a server, the convolutional neural network model can still be trained through the implementation process in the following embodiments. As shown in Fig. 4, the method comprises the following steps:
In step 401, the output value of a first hidden layer and the output value of a second hidden layer are obtained, the first hidden layer being the hidden layer whose output value has the lowest level of abstraction among the multiple hidden layers, and the second hidden layer being the hidden layer whose output value has the highest level of abstraction among the multiple hidden layers.
In the embodiments of the present disclosure, in order to reduce the amount of calculation, the terminal may analyze only the output value of the first hidden layer, whose output has the lowest level of abstraction, and the output value of the second hidden layer, whose output has the highest level of abstraction, to obtain their respective hiding probabilities; for the other hidden layers, the hiding probabilities can be determined with a simpler algorithm, without analyzing their output values. Therefore, in this step, the terminal may obtain only the output value of the first hidden layer and the output value of the second hidden layer.
It should be noted that, in current convolutional neural network models, the level of abstraction of the output value of the first hidden layer connected to the input layer is usually the lowest, and the level of abstraction of the output value of the last hidden layer connected to the output layer is usually the highest. Therefore, the terminal can directly take the output value of the first hidden layer connected to the input layer as the output value of the first hidden layer, and the output value of the last hidden layer connected to the output layer as the output value of the second hidden layer.
In step 402, the hiding probability of the first hidden layer is determined based on the output value of the first hidden layer, and the hiding probability of the second hidden layer is determined based on the output value of the second hidden layer.
Having obtained the output value of the first hidden layer, the terminal can determine the hiding probability of the first hidden layer in the manner of step 302 in the previous embodiment. Similarly, the hiding probability of the second hidden layer can be determined in the same manner as in step 302. The embodiments of the present disclosure will not repeat this here.
In step 403, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer, the hiding probabilities of the other hidden layers among the multiple hidden layers except the first and second hidden layers are determined.
After the hiding probabilities of the first and second hidden layers have been determined, the terminal can calculate the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer; the terminal can then determine the hiding probabilities of the remaining hidden layers among the multiple hidden layers based on this probability difference.
The terminal can sort the multiple hidden layers by the levels of abstraction of their output values in ascending order to obtain a sorting result, in which the first hidden layer in the sorting result is the first hidden layer and the last hidden layer in the sorting result is the second hidden layer. The terminal can then determine, based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the second hidden layer, the hiding probability of each hidden layer located between the first and the last hidden layer in the sorting result.
Specifically, after the probability difference between the hiding probability of the first hidden layer and that of the second hidden layer has been calculated, the terminal can determine the number K of hidden layers included in the convolutional neural network model and calculate the ratio between the probability difference and K-1. The terminal can then subtract this ratio from the hiding probability of the first hidden layer to obtain the hiding probability of the second hidden layer in the sorting result, and so on. That is, for each hidden layer located between the first hidden layer and the second hidden layer in the sorting result, subtracting the ratio from that hidden layer's hiding probability yields the hiding probability of the next hidden layer after it in the sorting result.
It should be noted that, in current convolutional neural network models, the levels of abstraction of the output values of the multiple hidden layers usually rise successively in the order from the input layer to the output layer. That is, the level of abstraction of the output value of the first hidden layer connected to the input layer is the lowest, that of the next hidden layer connected to the first hidden layer is higher, and so on, with the level of abstraction of the output value of the last hidden layer connected to the output layer being the highest. For such a convolutional neural network model, the terminal need not sort the multiple hidden layers by the levels of abstraction of their output values. It can directly determine the hiding probability of the first hidden layer connected to the input layer and that of the last hidden layer connected to the output layer, calculate the probability difference between the first and the last hidden layer, and then, based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer, calculate the hiding probability of each hidden layer located between the first and the last hidden layer in the model.
For example, suppose the hiding probability of the first hidden layer connected to the input layer in the convolutional neural network model is a, the hiding probability of the last hidden layer connected to the output layer is b, and the number K of hidden layers included in the model is 10. The terminal can calculate the probability difference a-b between the hiding probability a of the first hidden layer and the hiding probability b of the last hidden layer, and calculate the ratio (a-b)/(K-1). Then, for the next hidden layer connected to the first hidden layer, the hiding probability is a-(a-b)/(K-1); the hiding probability of the hidden layer after that is a-2*(a-b)/(K-1); and so on, so that the hiding probability of each hidden layer located between the first and the last hidden layer can be calculated.
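The linear interpolation above can be captured in a few lines; the function name is an illustrative assumption, with `a`, `b`, and `K` matching the symbols in the example.

```python
def interpolated_hide_probs(a, b, K):
    """Sketch of the second method: given the first hidden layer's
    hiding probability a, the last hidden layer's hiding probability b,
    and the number of hidden layers K, each successive layer's hiding
    probability drops by the fixed step (a - b) / (K - 1).
    """
    step = (a - b) / (K - 1)
    return [a - i * step for i in range(K)]
```

With a = 0.9, b = 0.45, and K = 10, the step is 0.05, giving per-layer hiding probabilities 0.9, 0.85, 0.8, ..., down to 0.45 for the last hidden layer.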
In step 404, for each hidden layer, a target node is selected, based on the hiding probability of the hidden layer, from the multiple nodes included in the hidden layer.
After the hiding probability of each hidden layer has been determined, for each hidden layer, the terminal may refer to step 303 in the previous embodiment and select a target node from the multiple nodes included in the hidden layer by the layer's hiding probability. The embodiments of the present disclosure will not repeat this here.
In step 405, the convolutional neural network model is trained based on the target nodes selected from the multiple hidden layers.
For this step, reference may be made to the implementation of step 304 in the previous embodiment, which the embodiments of the present disclosure will not repeat.
In the embodiments of the present disclosure, the terminal can determine the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers differ from one another and decrease successively in the order of the levels of abstraction of the layers' output values from low to high. Because, in the order from the input layer to the output layer, the level of abstraction of each hidden layer's output value is different, the terminal can determine a different hiding probability for each hidden layer according to its level of abstraction. And because different hidden layers correspond to different input values, training the convolutional neural network model with different hiding probabilities can effectively improve the image recognition accuracy of the model, compared with the related art in which all hidden layers are trained with the same hiding probability.
In addition, in this embodiment, the terminal may analyze only the output value of the hidden layer whose output has the highest level of abstraction and the output value of the hidden layer whose output has the lowest level of abstraction; for the remaining hidden layers, no output-value analysis is needed, and their hiding probabilities are determined according to the principle that the hiding probability decreases as the level of abstraction rises, which reduces the terminal's amount of calculation.
Having introduced the methods of training a convolutional neural network model provided by the embodiments of the present disclosure, the apparatus for training a convolutional neural network model provided by the embodiments of the present disclosure is introduced next.
Fig. 5A is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment. Referring to Fig. 5A, the apparatus includes a selecting module 501 and a training module 502.
The selecting module 501 is configured to, for each hidden layer among the multiple hidden layers included in the convolutional neural network model, select a target node, based on the hiding probability of the hidden layer, from the multiple nodes included in the hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another;
the training module 502 is configured to train the convolutional neural network model based on the target nodes selected from the multiple hidden layers.
Optionally, referring to Fig. 5B, the apparatus further includes:
a determining module 503, configured to determine the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers rise successively in the order of the levels of abstraction of the layers' output values from high to low.
Optionally, referring to Fig. 5C, the determining module 503 includes:
a first acquisition submodule 5031, configured to obtain, for each hidden layer among the multiple hidden layers included in the convolutional neural network model, the output value of the hidden layer;
a first determination submodule 5032, configured to determine the hiding probability of the hidden layer based on its output value.
Optionally, the first determination submodule is configured to:
perform singular value decomposition on the output value of the hidden layer to obtain N singular values, N being a positive integer greater than 1;
calculate the sum of squares of the N singular values, and calculate the product of that sum of squares and a preset ratio to obtain a target sum of squares;
sort the N singular values in descending order to obtain a sorting result;
determine the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values is less than the target sum of squares, and M is a positive integer greater than or equal to 1;
determine the ratio of M to N as the hiding probability of the hidden layer.
Optionally, referring to Fig. 5D, the determining module 503 includes:
a second acquisition submodule 5033, configured to obtain the output value of a first hidden layer and the output value of a second hidden layer, the first hidden layer being the hidden layer whose output value has the lowest level of abstraction among the multiple hidden layers, and the second hidden layer being the hidden layer whose output value has the highest level of abstraction among the multiple hidden layers;
a second determination submodule 5034, configured to determine the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determine the hiding probability of the second hidden layer based on the output value of the second hidden layer;
a third determination submodule 5035, configured to determine, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer, the hiding probabilities of the other hidden layers among the multiple hidden layers except the first and second hidden layers.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer;
the third determination submodule is configured to:
determine, based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer, the hiding probability of each hidden layer located between the first hidden layer and the last hidden layer.
Optionally, the selecting module 501 includes:
a fourth determination submodule, configured to, for each node among the multiple nodes included in the hidden layer, generate a random probability for the node according to a preset rule;
a fifth determination submodule, configured to determine the node as a target node when the random probability is less than the hiding probability.
In the embodiments of the present disclosure, the terminal can determine the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers differ from one another and decrease successively in the order of the levels of abstraction of the layers' output values from low to high. Because, in the order from the input layer to the output layer, the level of abstraction of each hidden layer's output value is different, the terminal can determine a different hiding probability for each hidden layer according to its level of abstraction. And because different hidden layers correspond to different input values, training the convolutional neural network model with different hiding probabilities can effectively improve the image recognition accuracy of the model, compared with the related art in which all hidden layers are trained with the same hiding probability.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments concerning the method, and will not be elaborated here.
Fig. 6 is a block diagram of an apparatus 600 for training a convolutional neural network model according to an exemplary embodiment. For example, the apparatus 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 6, the apparatus 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 typically controls the overall operation of the apparatus 600, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 602 may include one or more processors 620 to execute instructions, so as to complete all or part of the steps of the above method. In addition, the processing component 602 may include one or more modules to facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support the operation of the apparatus 600. Examples of such data include instructions of any application program or method operated on the apparatus 600, contact data, phonebook data, messages, pictures, video, and so on. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 606 provides power to the various components of the apparatus 600. The power component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 600.
The multimedia component 608 includes a screen providing an output interface between the apparatus 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the apparatus 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC); when the apparatus 600 is in an operation mode, such as a call mode, a recording mode, or a speech recognition mode, the microphone is configured to receive external audio signals. The received audio signals may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 also includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors to provide status assessments of various aspects of the device 600. For example, the sensor component 614 may detect the open/closed status of the device 600 and the relative positioning of components, such as the display and the keypad of the device 600; the sensor component 614 may also detect a change in position of the device 600 or of a component of the device 600, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in temperature of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an accelerometer, a gyroscope, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the device 600 and other devices. The device 600 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 600 may be implemented with one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the methods provided by the embodiments illustrated in Figs. 2-4 above.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 604 including instructions, which are executable by the processor 620 of the device 600 to perform the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided, wherein when the instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform the method of training a convolutional neural network model provided by the embodiments illustrated in Fig. 2, Fig. 3, and Fig. 4 above.
Fig. 7 is a block diagram of a device 700 for training a convolutional neural network model according to an exemplary embodiment. For example, the device 700 may be provided as a server. Referring to Fig. 7, the device 700 includes a processing component 722, which further includes one or more processors, and memory resources represented by a memory 732 for storing instructions executable by the processing component 722, such as application programs. The application programs stored in the memory 732 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 722 is configured to execute the instructions to perform the methods provided by the embodiments illustrated in Figs. 2-4 above.
The device 700 may also include a power component 726 configured to perform power management of the device 700, a wired or wireless network interface 750 configured to connect the device 700 to a network, and an input/output (I/O) interface 758. The device 700 may operate based on an operating system stored in the memory 732, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 732 including instructions, which are executable by the processing component 722 of the device 700 to perform the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided, wherein when the instructions in the storage medium are executed by a processor of a server, the server is enabled to perform the method of training a convolutional neural network model provided by the embodiments illustrated in Fig. 2, Fig. 3, and Fig. 4 above.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary technical means in the art not disclosed in this disclosure. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the invention is not limited to the precise constructions described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
Claims (16)
- 1. A method of training a convolutional neural network model, characterized in that the method comprises: for each hidden layer of a plurality of hidden layers comprised in the convolutional neural network model, selecting target nodes from a plurality of nodes comprised in the hidden layer based on a hiding probability of the hidden layer, the hiding probabilities of the plurality of hidden layers being different from one another; and training the convolutional neural network model based on the target nodes selected from the plurality of hidden layers.
- 2. The method according to claim 1, characterized in that, before selecting target nodes from the plurality of nodes comprised in the hidden layer based on the hiding probability of the hidden layer, the method further comprises: determining the hiding probability of each hidden layer of the plurality of hidden layers comprised in the convolutional neural network model, wherein the hiding probabilities of the plurality of hidden layers increase successively in order from the highest to the lowest abstraction level of the output values of the plurality of hidden layers.
- 3. The method according to claim 2, characterized in that determining the hiding probability of each hidden layer of the plurality of hidden layers comprised in the convolutional neural network model comprises: for each hidden layer of the plurality of hidden layers comprised in the convolutional neural network model, obtaining the output values of the hidden layer; and determining the hiding probability of the hidden layer based on the output values of the hidden layer.
- 4. The method according to claim 3, characterized in that determining the hiding probability of the hidden layer based on the output values of the hidden layer comprises: performing singular value decomposition on the output values of the hidden layer to obtain N singular values, N being a positive integer greater than 1; calculating the sum of squares of the N singular values, and calculating the product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares; sorting the N singular values in descending order to obtain a sorting result; determining the M-th singular value in the sorting result, wherein the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares and the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, M being a positive integer greater than or equal to 1; and determining the ratio of M to N as the hiding probability of the hidden layer.
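As an illustration only (not part of the claims), the singular-value procedure of claim 4 can be sketched in a few lines of NumPy; the function name, the default preset ratio, and the use of `searchsorted` to locate the M-th singular value are assumptions of this sketch, not elements of the patent:

```python
import numpy as np

def hiding_probability(output_values, preset_ratio=0.9):
    """Sketch of claim 4: derive a layer's hiding probability from the
    singular values of its output matrix as the ratio M / N, where the
    first M singular values (in descending order) are the fewest whose
    sum of squares reaches preset_ratio of the total sum of squares."""
    # Singular value decomposition of the layer's output values;
    # NumPy returns the singular values already sorted in descending order.
    singular_values = np.linalg.svd(output_values, compute_uv=False)
    n = len(singular_values)
    # Target sum of squares: preset ratio times the total energy.
    target = preset_ratio * np.sum(singular_values ** 2)
    # Cumulative sums of squares of the descending singular values;
    # M is the first position where the cumulative sum reaches the target.
    cumulative = np.cumsum(singular_values ** 2)
    m = int(np.searchsorted(cumulative, target)) + 1
    return m / n
```

A nearly rank-1 output matrix concentrates its energy in one singular value, so M stays small and the hiding probability is low; a layer whose output spreads energy evenly across all singular values yields a probability near 1.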
- 5. The method according to claim 2, characterized in that determining the hiding probability of each hidden layer of the plurality of hidden layers comprised in the convolutional neural network model comprises: obtaining the output values of a first hidden layer and the output values of a second hidden layer, the first hidden layer being the hidden layer with the lowest abstraction level of output values among the plurality of hidden layers, and the second hidden layer being the hidden layer with the highest abstraction level of output values among the plurality of hidden layers; determining the hiding probability of the first hidden layer based on the output values of the first hidden layer, and determining the hiding probability of the second hidden layer based on the output values of the second hidden layer; and determining the hiding probabilities of the other hidden layers of the plurality of hidden layers, other than the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
- 6. The method according to claim 5, characterized in that the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer; and determining the hiding probabilities of the other hidden layers of the plurality of hidden layers, other than the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer comprises: determining the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the plurality of hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
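Claims 5 and 6 derive the intermediate layers' hiding probabilities from the probability difference between the first and last hidden layers, given the number of layers. One natural reading is an evenly spaced (linear) interpolation; the sketch below is illustrative only, and the function name and the even spacing are assumptions of this sketch:

```python
def layer_probabilities(p_first, p_last, num_layers):
    """Sketch of claims 5-6: fill in the hiding probability of every
    hidden layer between the first (connected to the input layer) and
    the last (connected to the output layer) by spreading the
    probability difference evenly across the layers."""
    if num_layers < 2:
        return [p_first]
    # Per-layer increment derived from the probability difference
    # and the number of hidden layers.
    step = (p_last - p_first) / (num_layers - 1)
    return [p_first + i * step for i in range(num_layers)]
```

For example, with five hidden layers and endpoint probabilities 0.5 and 0.9, each intermediate layer's probability differs from its neighbor's by 0.1, consistent with the successive rise required by claim 2.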
- 7. The method according to any one of claims 1-6, characterized in that selecting target nodes from the plurality of nodes comprised in the hidden layer based on the hiding probability of the hidden layer comprises: for each node of the plurality of nodes comprised in the hidden layer, generating a random probability for the node according to a preset rule; and when the random probability is less than the hiding probability, determining the node as a target node.
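The selection rule in claim 7 amounts to a dropout-style mask: each node draws one random probability, and the node becomes a target node when the draw falls below the layer's hiding probability. A minimal sketch, where the function name, the use of a seeded uniform generator as the "preset rule", and the list-of-indices return value are all assumptions:

```python
import random

def select_target_nodes(num_nodes, hiding_prob, seed=None):
    """Sketch of claim 7: generate one random probability per node and
    keep the node as a target node when that draw is less than the
    layer's hiding probability."""
    rng = random.Random(seed)  # seeded for reproducibility in this sketch
    return [i for i in range(num_nodes) if rng.random() < hiding_prob]
```

On average a fraction `hiding_prob` of the layer's nodes are selected, so layers with higher hiding probabilities contribute more of their nodes to each training pass.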
- 8. A device for training a convolutional neural network model, characterized in that the device comprises: a selecting module, configured to, for each hidden layer of a plurality of hidden layers comprised in the convolutional neural network model, select target nodes from a plurality of nodes comprised in the hidden layer based on a hiding probability of the hidden layer, the hiding probabilities of the plurality of hidden layers being different from one another; and a training module, configured to train the convolutional neural network model based on the target nodes selected from the plurality of hidden layers.
- 9. The device according to claim 8, characterized in that the device further comprises: a determining module, configured to determine the hiding probability of each hidden layer of the plurality of hidden layers comprised in the convolutional neural network model, wherein the hiding probabilities of the plurality of hidden layers increase successively in order from the highest to the lowest abstraction level of the output values of the plurality of hidden layers.
- 10. The device according to claim 9, characterized in that the determining module comprises: a first obtaining submodule, configured to, for each hidden layer of the plurality of hidden layers comprised in the convolutional neural network model, obtain the output values of the hidden layer; and a first determining submodule, configured to determine the hiding probability of the hidden layer based on the output values of the hidden layer.
- 11. The device according to claim 10, characterized in that the first determining submodule is configured to: perform singular value decomposition on the output values of the hidden layer to obtain N singular values, N being a positive integer greater than 1; calculate the sum of squares of the N singular values, and calculate the product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares; sort the N singular values in descending order to obtain a sorting result; determine the M-th singular value in the sorting result, wherein the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares and the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, M being a positive integer greater than or equal to 1; and determine the ratio of M to N as the hiding probability of the hidden layer.
- 12. The device according to claim 9, characterized in that the determining module comprises: a second obtaining submodule, configured to obtain the output values of a first hidden layer and the output values of a second hidden layer, the first hidden layer being the hidden layer with the lowest abstraction level of output values among the plurality of hidden layers, and the second hidden layer being the hidden layer with the highest abstraction level of output values among the plurality of hidden layers; a second determining submodule, configured to determine the hiding probability of the first hidden layer based on the output values of the first hidden layer, and determine the hiding probability of the second hidden layer based on the output values of the second hidden layer; and a third determining submodule, configured to determine the hiding probabilities of the other hidden layers of the plurality of hidden layers, other than the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
- 13. The device according to claim 12, characterized in that the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer; and the third determining submodule is configured to: determine the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the plurality of hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
- 14. The device according to any one of claims 8-13, characterized in that the selecting module comprises: a fourth determining submodule, configured to, for each node of the plurality of nodes comprised in the hidden layer, generate a random probability for the node according to a preset rule; and a fifth determining submodule, configured to determine the node as a target node when the random probability is less than the hiding probability.
- 15. A device for training a convolutional neural network model, characterized in that the device comprises: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the steps of the method according to any one of claims 1-7.
- 16. A computer-readable storage medium having instructions stored thereon, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710675297.5A CN107480773B (en) | 2017-08-09 | 2017-08-09 | Method and device for training convolutional neural network model and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710675297.5A CN107480773B (en) | 2017-08-09 | 2017-08-09 | Method and device for training convolutional neural network model and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107480773A true CN107480773A (en) | 2017-12-15 |
CN107480773B CN107480773B (en) | 2020-11-13 |
Family
ID=60598970
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710675297.5A Active CN107480773B (en) | 2017-08-09 | 2017-08-09 | Method and device for training convolutional neural network model and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107480773B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109766493A (en) * | 2018-12-24 | 2019-05-17 | 哈尔滨工程大学 | A kind of cross-domain recommended method combining personality characteristics under neural network |
CN110188789A (en) * | 2019-04-16 | 2019-08-30 | 浙江工业大学 | A kind of small sample classification method of medical image based on pretreated model |
CN113496282A (en) * | 2020-04-02 | 2021-10-12 | 北京金山数字娱乐科技有限公司 | Model training method and device |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104517122A (en) * | 2014-12-12 | 2015-04-15 | 浙江大学 | Image target recognition method based on optimized convolution architecture |
CN104751842A (en) * | 2013-12-31 | 2015-07-01 | 安徽科大讯飞信息科技股份有限公司 | Method and system for optimizing deep neural network |
CN104850836A (en) * | 2015-05-15 | 2015-08-19 | 浙江大学 | Automatic insect image identification method based on depth convolutional neural network |
CN104850845A (en) * | 2015-05-30 | 2015-08-19 | 大连理工大学 | Traffic sign recognition method based on asymmetric convolution neural network |
CN105512676A (en) * | 2015-11-30 | 2016-04-20 | 华南理工大学 | Food recognition method at intelligent terminal |
CN106250921A (en) * | 2016-07-26 | 2016-12-21 | 北京小米移动软件有限公司 | Image processing method and device |
CN106250911A (en) * | 2016-07-20 | 2016-12-21 | 南京邮电大学 | A kind of picture classification method based on convolutional neural networks |
CN106548201A (en) * | 2016-10-31 | 2017-03-29 | 北京小米移动软件有限公司 | The training method of convolutional neural networks, image-recognizing method and device |
CN106951848A (en) * | 2017-03-13 | 2017-07-14 | 平安科技(深圳)有限公司 | The method and system of picture recognition |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104751842A (en) * | 2013-12-31 | 2015-07-01 | 安徽科大讯飞信息科技股份有限公司 | Method and system for optimizing deep neural network |
CN104517122A (en) * | 2014-12-12 | 2015-04-15 | 浙江大学 | Image target recognition method based on optimized convolution architecture |
CN104850836A (en) * | 2015-05-15 | 2015-08-19 | 浙江大学 | Automatic insect image identification method based on depth convolutional neural network |
CN104850845A (en) * | 2015-05-30 | 2015-08-19 | 大连理工大学 | Traffic sign recognition method based on asymmetric convolution neural network |
CN105512676A (en) * | 2015-11-30 | 2016-04-20 | 华南理工大学 | Food recognition method at intelligent terminal |
CN106250911A (en) * | 2016-07-20 | 2016-12-21 | 南京邮电大学 | A kind of picture classification method based on convolutional neural networks |
CN106250921A (en) * | 2016-07-26 | 2016-12-21 | 北京小米移动软件有限公司 | Image processing method and device |
CN106548201A (en) * | 2016-10-31 | 2017-03-29 | 北京小米移动软件有限公司 | The training method of convolutional neural networks, image-recognizing method and device |
CN106951848A (en) * | 2017-03-13 | 2017-07-14 | 平安科技(深圳)有限公司 | The method and system of picture recognition |
Non-Patent Citations (2)
Title |
---|
MOHAMED ELLEUCH et al., "A New Design Based-SVM of the CNN Classifier Architecture with Dropout for Offline Arabic Handwritten Recognition", Procedia Computer Science * |
HUANG Bin et al., "Object Recognition Algorithm Based on Deep Convolutional Neural Networks", Journal of Computer Applications (计算机应用) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109766493A (en) * | 2018-12-24 | 2019-05-17 | 哈尔滨工程大学 | A kind of cross-domain recommended method combining personality characteristics under neural network |
CN109766493B (en) * | 2018-12-24 | 2022-08-02 | 哈尔滨工程大学 | Cross-domain recommendation method combining personality characteristics under neural network |
CN110188789A (en) * | 2019-04-16 | 2019-08-30 | 浙江工业大学 | A kind of small sample classification method of medical image based on pretreated model |
CN113496282A (en) * | 2020-04-02 | 2021-10-12 | 北京金山数字娱乐科技有限公司 | Model training method and device |
Also Published As
Publication number | Publication date |
---|---|
CN107480773B (en) | 2020-11-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108256555B (en) | Image content identification method and device and terminal | |
CN108121952A (en) | Face key independent positioning method, device, equipment and storage medium | |
CN106446782A (en) | Image identification method and device | |
CN108171254A (en) | Image tag determines method, apparatus and terminal | |
CN107798669A (en) | Image defogging method, device and computer-readable recording medium | |
CN108664989A (en) | Image tag determines method, apparatus and terminal | |
US10701315B2 (en) | Video communication device and video communication method | |
CN107220667A (en) | Image classification method, device and computer-readable recording medium | |
CN106202330A (en) | The determination methods of junk information and device | |
CN109446961B (en) | Gesture detection method, device, equipment and storage medium | |
CN108009600A (en) | Model optimization, quality determining method, device, equipment and storage medium | |
CN106548468B (en) | The method of discrimination and device of image definition | |
CN104243814B (en) | Analysis method, image taking reminding method and the device of objects in images layout | |
CN107492115A (en) | The detection method and device of destination object | |
CN107145904A (en) | Determination method, device and the storage medium of image category | |
CN110009090A (en) | Neural metwork training and image processing method and device | |
CN107133354B (en) | Method and device for acquiring image description information | |
CN107527024A (en) | Face face value appraisal procedure and device | |
CN107845062A (en) | image generating method and device | |
CN107943266A (en) | power consumption control method, device and equipment | |
CN110443366A (en) | Optimization method and device, object detection method and the device of neural network | |
CN107784279A (en) | Method for tracking target and device | |
CN106203306A (en) | The Forecasting Methodology at age, device and terminal | |
CN109635920A (en) | Neural network optimization and device, electronic equipment and storage medium | |
CN107590534A (en) | Train the method, apparatus and storage medium of depth convolutional neural networks model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |