CN108985456A - Layer-count increasing and decreasing deep learning neural network training method, system, medium and device - Google Patents
- Publication number
- CN108985456A (application CN201810823422.7A)
- Authority
- CN
- China
- Prior art keywords
- deep learning
- neural network
- learning neural
- output data
- hidden layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a layer-count increasing and decreasing deep learning neural network training method, system, medium and device. The method comprises: inputting training input data into the current deep learning neural network and computing first output data with the current deep learning neural network; judging whether the first output data is identical to the expected output data; if the first preset condition is not met, inserting a hidden layer before the classifier in the current deep learning neural network; otherwise, inputting test input data into the current deep learning neural network and computing second output data with it; judging whether the second output data is identical to the real result data; if the second preset condition is not met, deleting the hidden layer immediately before the classifier in the current deep learning neural network; otherwise, outputting the current deep learning neural network. With the present invention, when a just-sufficient fit is reached, the top-layer concepts are exactly those concepts that are just abstract enough to fit the output data fully.
Description
Technical field
The present invention relates to a deep learning neural network training method, and in particular to a layer-count increasing and decreasing deep learning neural network training method, system, medium and device, belonging to the field of neural network training.
Background art
Existing deep learning technology can derive an output label from input data (for example, obtaining a person's ID number from a head portrait, or from a voice sample). In its top-down supervised training stage it performs supervised training with labeled data (for example, head portraits tagged with ID numbers, or voice samples tagged with ID numbers).

However, the top-down supervised training of existing deep learning technology either adjusts only the network weights between the output layer and the hidden layers, or adjusts the network weights of all layers. When the top-layer concepts have more classes than there are labels, adjusting only the classifier weights between the output layer and the hidden layers fails if the classifier's network structure is fairly simple: repeatedly tuning the classifier's network parameters satisfies one output label only to violate another, so a sufficient fit can never be reached. If the classifier's network structure is instead designed to be extremely complex, for example using a deep, multi-level BP neural network as the classifier, overfitting appears: the fit discards certain key features, so the classification results are completely correct on the samples but turn out to be wrong in application.

It can be seen that supervised training confined to the levels between the output layer and the hidden layers either cannot fit sufficiently or overfits, and both cause deep learning to fail in application. Adjusting the network weights of all layers, on the other hand, corrupts the cognitive weights and generative weights in the hidden layers, so that after adjustment the learned concepts and scenes are no longer purely the features and scenes derived from the input data, but features and scenes distorted to suit the output labels. Overfitting likewise appears, making the classification results completely correct on the samples but wrong in application.
Summary of the invention
The first purpose of the present invention is to remedy the above defects of the prior art by providing a layer-count increasing and decreasing deep learning neural network training method. The method ensures that the top-layer concepts carry just enough feature information to distinguish different sample data, so that they correspond completely to the expected output data and the real result data; when a just-sufficient fit is reached, the top-layer concepts are exactly those concepts that are just abstract enough to fit the output data fully.

The second purpose of the present invention is to provide a layer-count increasing and decreasing deep learning neural network training system.

The third purpose of the present invention is to provide a storage medium.

The fourth purpose of the present invention is to provide a computing device.
The first purpose of the present invention can be achieved by adopting the following technical scheme:

A layer-count increasing and decreasing deep learning neural network training method, the method comprising:

training the current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;

inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;

judging whether the first output data is identical to the expected output data corresponding to the training input data;

when the number of first output data items differing from the corresponding expected output data does not meet the first preset condition, inserting a hidden layer before the classifier in the current deep learning neural network;

when the number of first output data items differing from the corresponding expected output data meets the first preset condition, inputting test input data into the current deep learning neural network and computing second output data with it;

judging whether the second output data is identical to the real result data corresponding to the test input data;

when the number of second output data items differing from the corresponding real result data does not meet the second preset condition, deleting the hidden layer immediately before the classifier in the current deep learning neural network;

when the number of second output data items differing from the corresponding real result data meets the second preset condition, outputting the current deep learning neural network.
Further, inserting a hidden layer before the classifier in the current deep learning neural network specifically comprises: connecting the output of the last hidden layer to the input of the newly inserted hidden layer through an encoding-decoding network, and taking the output of the newly inserted hidden layer as the input of the classifier in the current deep learning neural network.

Further, the node count of the newly inserted hidden layer is less than or equal to the node count of the last hidden layer.
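Representing the network simply as a list of hidden-layer widths, the insertion rule (new layer fed by the last hidden layer, its output feeding the classifier, node count capped at the last layer's) can be sketched as follows; the `shrink` factor is an invented illustration parameter, not part of the patent:

```python
def insert_hidden_layer(hidden_sizes, shrink=0.5):
    """Append a new hidden layer just before the classifier.

    In the patent's scheme the last hidden layer's output is wired to
    the new layer through an encoding-decoding network and the new
    layer's output becomes the classifier's input; here only the layer
    widths are tracked.  The new width never exceeds the last layer's."""
    last = hidden_sizes[-1]
    new = max(1, int(last * shrink))  # node count <= last hidden layer's
    return hidden_sizes + [new]
```

For example, `insert_hidden_layer([256, 128, 64])` yields `[256, 128, 64, 32]`: the classifier now reads a narrower, more abstract top layer.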
Further, deleting the hidden layer immediately before the classifier in the current deep learning neural network specifically comprises: taking the nodes of the penultimate hidden layer as the input nodes of the classifier in the current deep learning neural network.

Further, the node count of the penultimate hidden layer is greater than the node count of the last hidden layer.
Further, the first preset condition comprises: the error rate of the first output data against the expected output data corresponding to the training input data is less than or equal to a first preset threshold.

The error rate of the first output data against the corresponding expected output data is calculated as: the number of first output data items differing from the corresponding expected output data divided by the total number of tests performed on the training input data.

Further, the second preset condition comprises: the error rate of the second output data against the real result data corresponding to the test input data is less than or equal to a second preset threshold.

The error rate of the second output data against the corresponding real result data is calculated as: the number of second output data items differing from the corresponding real result data divided by the total number of tests performed on the test input data.
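Both error rates share one formula: mismatches divided by total tests, compared against a threshold. A minimal sketch (the function names are ours, not the patent's):

```python
def error_rate(outputs, references):
    """(# outputs differing from their reference) / (total # of tests)."""
    mismatches = sum(1 for o, r in zip(outputs, references) if o != r)
    return mismatches / len(references)


def meets_preset_condition(outputs, references, threshold):
    """A preset condition holds when the error rate is <= its threshold."""
    return error_rate(outputs, references) <= threshold
```

With expected labels as `references` this computes the first error rate A/M; with real result labels, the second error rate B/N.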
The second purpose of the present invention can be achieved by adopting the following technical scheme:

A layer-count increasing and decreasing deep learning neural network training system, the system comprising:

a training module, for training the current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;

a first input module, for inputting training input data into the current deep learning neural network and computing first output data with the current deep learning neural network;

a first judgment module, for judging whether the first output data is identical to the expected output data corresponding to the training input data;

a hidden-layer insertion module, for inserting a hidden layer before the classifier in the current deep learning neural network when the number of first output data items differing from the corresponding expected output data does not meet the first preset condition;

a second input module, for inputting test input data into the current deep learning neural network and computing second output data through the deep learning neural network when the number of first output data items differing from the corresponding expected output data meets the first preset condition;

a second judgment module, for judging whether the second output data is identical to the real result data corresponding to the test input data;

a hidden-layer deletion module, for deleting the hidden layer immediately before the classifier in the current deep learning neural network when the number of second output data items differing from the corresponding real result data does not meet the second preset condition;

an output module, for outputting the current deep learning neural network when the number of second output data items differing from the corresponding real result data meets the second preset condition.
The third purpose of the present invention can be achieved by adopting the following technical scheme:

A storage medium storing a program which, when executed by a processor, implements the above layer-count increasing and decreasing deep learning neural network training method.
The fourth purpose of the present invention can be achieved by adopting the following technical scheme:

A computing device, comprising a processor and a memory for storing a program executable by the processor; when the processor executes the program stored in the memory, the above layer-count increasing and decreasing deep learning neural network training method is implemented.
Compared with the prior art, the present invention has the following beneficial effects:

1. The present invention inputs training input data into the deep learning neural network and computes first output data with it. If the number of first output data items differing from the corresponding expected output data does not meet the preset condition, the number of hidden layers is increased; if it does meet the preset condition, test input data is input into the deep learning neural network and second output data is computed with it. If the number of second output data items differing from the corresponding test output data does not meet the preset condition, the number of hidden layers is reduced. By increasing or decreasing the number of hidden layers until a just-sufficient fit is reached, the fit is corrected, so that the top-layer concepts carry just enough feature information to distinguish different sample data and thus correspond completely to the expected output data and the real result data; when the just-sufficient fit is reached, the top-layer concepts are exactly those that are just abstract enough to fit the output labels fully.

2. When increasing the hidden layers of the deep learning neural network, the present invention connects the output of the last hidden layer to the input of the newly inserted hidden layer through an encoding-decoding network and takes the output of the newly inserted hidden layer as the input of the classifier, with the node count of the newly inserted hidden layer less than or equal to that of the last hidden layer. This makes the top-layer concepts fed to the classifier more abstract, discarding features that cannot be mapped to the output labels and distilling features that can correspond fully to the output data.

3. When reducing the hidden layers of the deep learning neural network, the present invention takes the nodes of the penultimate hidden layer as the input nodes of the classifier in the current deep learning neural network. The node count of the penultimate hidden layer is typically greater than that of the last hidden layer, which makes the top-layer concepts fed to the classifier more specific, recovering previously ignored features that can correspond fully to the output data.
Brief description of the drawings
Fig. 1 is a flowchart of the layer-count increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.

Fig. 2 is a model diagram of the deep learning neural network in the layer-count increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.

Fig. 3 is a schematic diagram of inserting a new hidden layer into the deep learning neural network in the layer-count increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.

Fig. 4 is a schematic diagram of deleting a hidden layer from the deep learning neural network in the layer-count increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.

Fig. 5 is a structural block diagram of the layer-count increasing and decreasing deep learning neural network training system of Embodiment 2 of the present invention.
Specific embodiments

The present invention will now be described in further detail with reference to the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.
Embodiment 1:
The rationale and construction process of deep learning are as follows:

The computation involved in producing an output from an input can be represented by a flow graph: a graph in which each node represents a basic computation and a computed value, the result of the computation being applied to the values of that node's child nodes. Considering the set of computations allowed at each node, together with the possible graph structures, defines a family of functions. Input nodes have no parents, and output nodes have no children.

One special attribute of such a flow graph is its depth: the length of the longest path from an input to an output.
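Depth as defined here, the longest input-to-output path, can be computed directly on a flow graph given as a child-to-parents mapping; the graph and node names below are illustrative:

```python
def depth(parents_of, node):
    """Longest path length (in edges) from any input node to `node`
    in an acyclic flow graph; input nodes have no parents."""
    parents = parents_of.get(node, [])
    if not parents:
        return 0
    return 1 + max(depth(parents_of, p) for p in parents)


# x -> h1 -> h2 -> out, with a skip edge x -> h2
flow_graph = {"h1": ["x"], "h2": ["h1", "x"], "out": ["h2"]}
```

Here `depth(flow_graph, "out")` is 3: the longest chain x → h1 → h2 → out wins over the shorter skip path.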
Regarding the learned structure as a network, the core ideas of deep learning are as follows:

Step 1: bottom-up unsupervised training

1) Build single layers of neurons one layer at a time.

2) Tune each layer with the wake-sleep algorithm, adjusting only one layer at a time, layer by layer.

This process can be regarded as a feature-learning process, and it is the part that differs most from traditional neural networks. The wake-sleep algorithm:

1) Wake phase: the cognitive process generates each layer's abstract representation (Code) from the lower layer's input features (Input) through the upward cognitive (Encoder) weights, then produces a reconstruction (Reconstruction) through the current generative (Decoder) weights, computes the residual between the input features and the reconstruction, and modifies the interlayer downward generative (Decoder) weights by gradient descent. In other words: "if reality differs from what I imagine, change my generative weights so that what I imagine becomes like reality."

2) Sleep phase: the generative process produces the lower layer's state from the upper-layer concept (Code) through the downward generative (Decoder) weights, then uses the cognitive (Encoder) weights to generate an abstract scene. Using the residual between the initial upper-layer concept and the newly generated abstract scene, the interlayer upward cognitive (Encoder) weights are modified by gradient descent. In other words: "if a scene in a dream has no corresponding concept in my mind, change my cognitive weights so that, to me, this scene is exactly that concept."
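One wake-sleep tuning step can be sketched with scalar weights. This is a toy of our own construction (the real algorithm operates on full weight matrices, layer by layer), but it mirrors the two phases described above:

```python
def wake_sleep_step(x, w_enc, w_dec, lr=0.1):
    """One scalar wake-sleep step.

    Wake: recognize upward (Encoder), reconstruct downward (Decoder),
    and nudge the generative weight to shrink the reconstruction residual.
    Sleep: generate downward from the code, re-recognize upward, and
    nudge the cognitive weight to shrink the concept residual."""
    # Wake phase: encoder fixed, decoder adjusted.
    code = w_enc * x                             # abstract representation
    recon = w_dec * code                         # reconstruction of the input
    w_dec -= lr * 2 * (recon - x) * code         # gradient of (recon - x)**2
    # Sleep phase: decoder fixed, encoder adjusted.
    dream = w_dec * code                         # generated lower-layer state
    scene = w_enc * dream                        # re-recognized abstract scene
    w_enc -= lr * 2 * (scene - code) * dream     # gradient of (scene - code)**2
    return w_enc, w_dec
```

Each step moves the weight pair toward w_enc * w_dec ≈ 1, i.e. toward reconstructions that match the input, which is the stated goal of both quoted mottos.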
Step 2: top-down supervised training

On the basis of the per-layer parameters learned in the first step, a classifier (such as logistic regression or an SVM) is added on the topmost coding layer; the whole network's parameters are then fine-tuned by gradient descent through supervised training with labeled data.

The first step of deep learning is essentially a network-parameter initialization process. Unlike the random initial values of traditional neural networks, a deep learning neural network obtains its initial values through unsupervised training on the structure of the input data, so these initial values are closer to the global optimum, which yields better results.
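A single supervised fine-tuning step for a logistic-regression classifier placed on the topmost code layer might look like this. It is a scalar sketch with invented names; the text above only requires "a classifier such as logistic regression or an SVM":

```python
import math


def finetune_classifier_step(code, label, w, b, lr=0.5):
    """One gradient-descent step on a scalar logistic-regression
    classifier reading the topmost code; `label` is 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-(w * code + b)))  # predicted probability
    grad = p - label                             # d(cross-entropy)/d(logit)
    return w - lr * grad * code, b - lr * grad
```

Starting from zero weights with a positive example, one step raises the predicted probability above 0.5; in the full scheme the same gradient would also flow down through the pretrained layers to fine-tune the whole network.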
During supervised training of the deep learning neural network, if there are too many layers, the top-layer concepts become over-abstracted and lack distinguishing details, and during supervised training the concepts acquire "noise" details that happen to separate the output labels. As the top layer overfits the labels, non-distinguishing noise features are fitted into the top-layer concepts, which inevitably increases the error rate in subsequent testing. For example, suppose the labels include "white man", "white woman", "black man" and "black woman", but the top-layer concepts have abstracted only the "man" and "woman" features, ignoring the "black" and "white" features; all sample data then fit onto just the two labels "man" and "woman". Because deep learning uses bottom-up unsupervised training followed by top-down supervised training, in the bottom-up unsupervised stage the sample classes "white man" and "black man" clearly correspond to the "man" top-layer concept, and "white woman" and "black woman" to the "woman" concept; likewise in the top-down supervised stage the labels "white man" and "black man" correspond to the "man" concept and the labels "white woman" and "black woman" to the "woman" concept. Through supervised training, the deep learning neural network automatically adjusts its network weights until the top-layer concept "man + noise 1" corresponds to "black man", "man + noise 2" to "white man", "woman + noise 3" to "black woman" and "woman + noise 4" to "white woman". Because of repeated fitting during supervised training, a sufficient fit is achieved for the training data. But in use, when "white man" test data is input, the resulting top-layer concept is "man + noise 2"; since noise 2 is not a feature that genuinely distinguishes white men from black men, the label the classifier assigns may be "white man" or "black man", and under the interference of noise 2 may even be "white woman" or "black woman", so the error rate at test time rises.
If there are too few layers, the top-layer concepts become too specific, carrying details that are irrelevant or even contradictory to distinguishing the output labels, so the top-layer concepts cannot be fitted to the labels; that is, no one-to-one correspondence between top-layer concepts and labels can be established. For example, suppose the labels are "man" and "woman", while the features of the top-layer concepts include hair features and skin-color features in addition to the essential features distinguishing men and women. Because deep learning uses bottom-up unsupervised training and top-down supervised training, if the supervised stage begins with many training samples of "short-haired man", "black man", "white woman" and "long-haired woman", the top layer forms the concepts "short-haired man", "black man", "white woman" and "long-haired woman", where "short-haired man" and "black man" map through the classifier to the "man" output label, and "white woman" and "long-haired woman" to the "woman" output label. But if many training samples of "long-haired man" and "black woman" arrive later, the top-layer concepts are readjusted to "long-haired man", "black man", "black woman" and "long-haired woman", after which the "short-haired man" and "white woman" samples clearly can no longer be fitted. The deep learning neural network thus keeps adjusting its network weights as the samples vary but can never fit sufficiently. The cause is that the network has too few layers: the level of abstraction is insufficient, and the non-distinguishing details are not weeded out through enough abstraction, so the number of network layers needs to be increased.
Therefore, the present embodiment provides a layer-count increasing and decreasing deep learning neural network training method. The method ensures that the top-layer concepts carry just enough feature information to distinguish different sample data and thus correspond completely to the expected output data and the real result data; when a just-sufficient fit is reached, the top-layer concepts are exactly those concepts that are just abstract enough to fit the output data fully.
As shown in Fig. 1, the layer-count increasing and decreasing deep learning neural network training method of this embodiment comprises the following steps:

S101: training the current deep learning neural network with samples.

The current deep learning neural network is obtained as follows: a deep learning neural network is initialized to obtain a preset deep learning neural network, which serves as the current deep learning neural network. As shown in Fig. 2, the deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer, where the classifier sits between the hidden layers and the output layer: the hidden layers provide the classifier's input, and the output layer is the classifier's output.

The current deep learning neural network is trained with samples, for example with face images and name labels.
S102: inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network.

The training input data can be face images. Specifically, the training input data can be obtained by acquisition, for example capturing face images with a camera; it can also be retrieved from a database, for example by storing face images in a database beforehand and searching the database for them.

The first output data can be name labels; specifically, each face training image is input into the current deep learning neural network, and a name label is output through the deep learning neural network's computation.
S103: judging whether the first output data is identical to the expected output data corresponding to the training input data.

Specifically, it is judged whether each output name label is identical to the expected name label corresponding to the face training image. Every output of a name label is recorded in the total number of tests on the face training images: at the first name-label output the total is recorded as 1, at the second as 2, and so on, until at the M-th output the total is recorded as M. The number of the M output name labels that differ from the expected name labels of the corresponding face training images is counted and recorded as A. The error rate of the output name labels against the expected name labels is recorded as the first error rate, obtained by dividing the number of mismatched name labels by the total number of tests on the face training images, i.e. A/M. If the first error rate is less than or equal to the first preset threshold, the number of mismatches meets the first preset condition, and the method proceeds to step S105; otherwise, i.e. the first error rate is greater than the first preset threshold, the number of mismatches does not meet the first preset condition, and the method proceeds to step S104.

Suppose the first preset threshold is 30%. If the first error rate is, say, 90%, it exceeds the first preset threshold, indicating that the fit is insufficient: the current deep learning neural network cannot effectively map the training data to the output labels because there are too few hidden layers to hold enough mapping relations, so the number of hidden layers must be increased. If the first error rate is, say, 20%, it is less than or equal to the first preset threshold, indicating that the degree of fit is sufficient, and the next question is whether the network is overfitted.
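With the worked numbers above, the S103 decision reduces to a single comparison; the step labels returned are just the identifiers used in this description:

```python
def step_after_s103(first_error_rate, first_threshold=0.30):
    """S103 branch: error above the first threshold -> insufficient fit,
    go add a hidden layer (S104); otherwise go check for overfitting (S105)."""
    return "S104" if first_error_rate > first_threshold else "S105"
```

With the text's example rates, a 90% first error rate routes to S104 and a 20% rate routes to S105.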
S104: inserting a hidden layer before the classifier in the current deep learning neural network.

If the deep learning neural network is not fitted sufficiently during sample learning, then even the training sample data themselves will not reliably yield the correct expected output labels when tested, causing deep learning to fail. In deep learning, an insufficient fit indicates that unnecessary characteristic details may have been attached to the sample features during the cognitive process, and such details often interfere with classification; a hidden layer therefore needs to be added until a just-sufficient fit is reached.

This step increases the degree of fit by increasing the number of hidden layers. Specifically, the output of the last hidden layer is connected to the input of the newly inserted hidden layer through an encoding-decoding network, and the output of the newly inserted hidden layer becomes the input of the classifier. The insertion of a new hidden layer is shown in Fig. 3, where the hidden layer marked with circles is the newly inserted one. After the insertion, the method returns to step S101.

Preferably, the node count of the newly inserted hidden layer is less than or equal to the original node count of the last hidden layer. The reason the new layer's node count is allowed to be less than or equal to, rather than greater than, the last hidden layer's original node count is that this makes the top-layer concepts fed to the classifier more abstract, discarding features that cannot be mapped to the output labels and distilling features that can correspond fully to the output labels.
S105: inputting test input data into the current deep learning neural network, and computing second output data with the current deep learning neural network.

The test input data can be face images, and the second output data can be name labels; specifically, each face test image is input into the current deep learning neural network, and a name label is output through the deep learning neural network's computation.
S106, judge whether the second output data legitimate reading data corresponding with test input data are identical.
Specifically, judge whether the nametags expected nametags corresponding with face test image of output are identical, often
The output of nametags is all recorded in the sum that face test image is tested, namely output name mark for the first time
Label, the sum that face test image is tested are denoted as 1, and second of output nametags, face test image is tested
Sum is denoted as 2, and so on, n-th exports nametags, and the sum that face test image is tested is denoted as N, counts n times
The different quantity of Real Name label corresponding with face test image, is denoted as B for the quantity, by output in nametags
The error rate of nametags Real Name label corresponding with face test image is denoted as the second error rate, and the second error rate utilizes
The nametags different quantity of Real Name label corresponding with face training image is tested divided by face test image
Sum obtain, i.e. B/N illustrate to meet nametags and face survey if the second error rate is less than or equal to the second preset threshold
Attempt to enter step S108 as different the second preset condition of quantity of corresponding Real Name label, otherwise, i.e. the second error
Rate is greater than the second preset threshold, illustrates the nametags different quantity of Real Name label corresponding with face test image not
Meet the second preset condition, enters step S107.
Assume the second preset threshold is 20%. If the second error rate is, for example, 80%, it is greater than the second preset threshold, which indicates overfitting: the features extracted from the test input data may have dropped necessary characteristic details during the cognition process, details that are indispensable for classification, so the number of hidden layers must be reduced until just-sufficient fitting is reached. If the second error rate is, for example, 10%, it is less than or equal to the second preset threshold, which indicates that just-sufficient fitting has been reached. In addition, the second error rate is generally larger than the first error rate, because even at just-sufficient fitting, accuracy on test data is always lower than accuracy on training data.
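The bookkeeping in step S106 (record the running total N of face test images, count the mismatched name labels B, and compare B/N with the second preset threshold) can be sketched as follows; the label values and the 20% threshold are illustrative, not taken from the patent:

```python
def second_error_rate(output_labels, real_labels):
    """Second error rate of S106: B mismatched name labels over N test images (B/N)."""
    n = len(output_labels)  # N: total face test images tested
    b = sum(out != real for out, real in zip(output_labels, real_labels))  # B
    return b / n

# Hypothetical name labels for four face test images.
rate = second_error_rate(["ann", "bob", "eve", "bob"],
                         ["ann", "bob", "eve", "ann"])  # B = 1, N = 4
meets_condition = rate <= 0.20  # second preset threshold of 20% (illustrative)
# rate == 0.25 > 0.20, so the second preset condition is not met: go to S107.
```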
S107, delete the hidden layer immediately preceding the classifier in the current deep learning neural network.
If the deep learning neural network overfits during sample learning, then after training, all of the sample data yield the correct expected output labels, but much of the test data fails to yield the correct output labels, causing the deep learning to fail. In deep learning, overfitting indicates that the features of the sample data may have dropped necessary characteristic details during the cognition process, details that are indispensable for classification; therefore, the layers must be decreased one at a time until just-sufficient fitting is reached.
This step reduces the degree of fitting by reducing the number of hidden layers. Specifically, the hidden layer immediately preceding the classifier in the current deep learning neural network is deleted by taking the nodes of the penultimate hidden layer as the input nodes of the classifier in the current deep learning neural network. A schematic diagram of deleting a hidden layer is shown in figure 4, where the circled hidden layer is the one to be deleted. After the hidden layer is deleted, the method returns to step S101 and continues.
Preferably, the node count of the penultimate hidden layer is greater than the node count of the last hidden layer, so that the top-level concept input into the classifier regains the previously dropped features that can correspond sufficiently to the output labels.
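Under the simplifying assumption that the network is described only by its hidden-layer node counts, the rewiring of S107 (the penultimate hidden layer becomes the classifier input) can be sketched as:

```python
def delete_last_hidden_layer(hidden_sizes):
    """S107: remove the hidden layer just before the classifier (the circled
    layer in figure 4) and wire the penultimate hidden layer's nodes to the
    classifier input instead. Returns the remaining layer sizes, the new
    classifier input width, and the size of the removed layer."""
    if len(hidden_sizes) < 2:
        raise ValueError("need at least two hidden layers to delete one")
    removed = hidden_sizes.pop()         # last hidden layer is discarded
    classifier_input = hidden_sizes[-1]  # penultimate layer now feeds the classifier
    return hidden_sizes, classifier_input, removed

# Example: three hidden layers of 128, 64 and 32 nodes.
layers, clf_in, removed = delete_last_hidden_layer([128, 64, 32])
# layers == [128, 64]; the classifier now reads 64 inputs (> 32, matching claim 5).
```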
S108, output the current deep learning neural network.
When the second error rate is less than or equal to the second preset threshold, supervised training ends and the current deep learning neural network is output as the just-sufficiently fitted deep learning neural network, i.e. one that is neither under-fitted nor over-fitted but fitted just right. At this point, if the test input data is input into the current deep learning neural network, the top-level concept obtained through unsupervised learning reflects, to the greatest possible extent, the features of the test input data that correspond to the expected output labels; feeding these features into the classifier naturally yields the expected label output with maximum probability.
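Putting steps S101–S108 together: grow the hidden layers while the first (training) error rate is too high, shrink them while the second (test) error rate is too high, and stop at just-sufficient fitting. The sketch below assumes a hypothetical network object exposing `train`, `error_rate`, `add_hidden_layer`, and `delete_last_hidden_layer`; the thresholds and the toy network are illustrative, not taken from the patent:

```python
def train_with_layer_adjustment(net, train_data, test_data,
                                thr1=0.05, thr2=0.20, max_rounds=100):
    """Loop of S101-S108: add a hidden layer on under-fitting (S104),
    delete one on over-fitting (S107), and output the network at
    just-sufficient fitting (S108)."""
    for _ in range(max_rounds):
        net.train(train_data)                  # S101: (re)train after each change
        if net.error_rate(train_data) > thr1:  # first error rate too high
            net.add_hidden_layer()             # S104: insert layer before classifier
        elif net.error_rate(test_data) > thr2: # second error rate too high
            net.delete_last_hidden_layer()     # S107
        else:
            return net                         # S108: just-sufficient fitting
    raise RuntimeError("no just-sufficient fit within max_rounds")

# Toy stand-in whose error rates depend only on the layer count, so the loop
# converges to exactly three hidden layers.
class ToyNet:
    def __init__(self):
        self.layers = 1
    def train(self, data):
        pass
    def error_rate(self, data):
        if data == "train":
            return 0.5 if self.layers < 3 else 0.01
        return 0.5 if self.layers > 3 else 0.10
    def add_hidden_layer(self):
        self.layers += 1
    def delete_last_hidden_layer(self):
        self.layers -= 1

net = train_with_layer_adjustment(ToyNet(), "train", "test")
# net.layers == 3: neither under- nor over-fitted under this toy criterion.
```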
Those of ordinary skill in the art will appreciate that all or part of the steps of the method of the above embodiment can be carried out by a program instructing the relevant hardware; the corresponding program can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc.
Embodiment 2:
As shown in figure 5, this embodiment provides a layer-count increasing and decreasing deep learning neural network training system. The system includes a training module 501, a first input module 502, a first judgment module 503, a hidden layer increase module 504, a second input module 505, a second judgment module 506, a hidden layer deletion module 507, and an output module 508. The specific functions of the modules are as follows:
The training module 501 is used to train the current deep learning neural network with samples; the current deep learning neural network includes an input layer, hidden layers, a classifier, and an output layer.
The first input module 502 is used to input the training input data into the current deep learning neural network and obtain the first output data through the current deep learning neural network computation.
The first judgment module 503 is used to judge whether the first output data is identical to the expected output data corresponding to the training input data.
The hidden layer increase module 504 is used to add a hidden layer before the classifier in the current deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition. Specifically, the input of the newly inserted hidden layer is connected to the output of the last hidden layer through an encoder-decoder network, and the output of the newly inserted hidden layer serves as the input of the classifier in the current deep learning neural network. The first preset condition includes: the error rate between the first output data and the expected output data corresponding to the training input data is less than or equal to the first preset threshold.
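Representing the network only by its hidden-layer node counts, the insertion rule of module 504 (a new layer between the last hidden layer and the classifier, with a node count no larger than the last hidden layer's, per claim 3) can be sketched as follows. The halving rule is an illustrative assumption; in the full method, the new layer's weights would first be pre-trained as an encoder-decoder network on the previous layer's outputs:

```python
def insert_hidden_layer(hidden_sizes, new_size=None):
    """Module 504: append a new hidden layer whose input connects to the last
    hidden layer and whose output becomes the classifier input."""
    last = hidden_sizes[-1]
    if new_size is None:
        new_size = max(1, last // 2)  # assumption: halve the previous width
    if new_size > last:
        raise ValueError("claim 3: new layer must not exceed the last layer's size")
    hidden_sizes.append(new_size)     # its output now feeds the classifier
    return hidden_sizes

# Example: after insertion the classifier reads from the new 32-node layer.
sizes = insert_hidden_layer([128, 64])
# sizes == [128, 64, 32]
```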
The second input module 505 is used to input the test input data into the current deep learning neural network and obtain the second output data through the deep learning neural network computation when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition.
The second judgment module 506 is used to judge whether the second output data is identical to the actual result data corresponding to the test input data.
The hidden layer deletion module 507 is used to delete the hidden layer immediately preceding the classifier in the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition. Specifically, the nodes of the penultimate hidden layer serve as the input nodes of the classifier in the current deep learning neural network. The second preset condition includes: the error rate between the second output data and the actual result data corresponding to the test input data is less than or equal to the second preset threshold.
The output module 508 is used to output the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition.
It should be noted that the system provided by the above embodiment is illustrated only by the division of the functional modules described above; in practical applications, the above functions can be allocated to different functional modules as needed, i.e. the internal structure can be divided into different functional modules to complete all or part of the functions described above.
It can be understood that the terms "first", "second", etc. used in the system of the above embodiment can describe various units, but these units should not be limited by these terms; the terms are only used to distinguish one module from another. For example, without departing from the scope of the present invention, the first judgment module could be called the second judgment module and, similarly, the second judgment module could be called the first judgment module; both are judgment modules, but they are not the same judgment module.
Embodiment 3:
This embodiment provides a storage medium storing one or more programs. When the programs are executed by a processor, the layer-count increasing and decreasing deep learning neural network training method of embodiment 1 above is realized, as follows:
Train the current deep learning neural network with samples; wherein the current deep learning neural network includes an input layer, hidden layers, a classifier, and an output layer;
input the training input data into the current deep learning neural network, and obtain the first output data through the current deep learning neural network computation;
judge whether the first output data is identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition, add a hidden layer before the classifier in the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, input the test input data into the current deep learning neural network, and obtain the second output data through the deep learning neural network computation;
judge whether the second output data is identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition, delete the hidden layer immediately preceding the classifier in the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, output the current deep learning neural network.
The storage medium described in this embodiment can be a medium such as ROM, RAM, a magnetic disk, or an optical disc.
Embodiment 4:
This embodiment provides a computing device, which includes a processor and a memory storing one or more programs. When the processor executes the programs stored in the memory, the layer-count increasing and decreasing deep learning neural network training method of embodiment 1 above is realized, as follows:
Train the current deep learning neural network with samples; wherein the current deep learning neural network includes an input layer, hidden layers, a classifier, and an output layer;
input the training input data into the current deep learning neural network, and obtain the first output data through the current deep learning neural network computation;
judge whether the first output data is identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition, add a hidden layer before the classifier in the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, input the test input data into the current deep learning neural network, and obtain the second output data through the deep learning neural network computation;
judge whether the second output data is identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition, delete the hidden layer immediately preceding the classifier in the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, output the current deep learning neural network.
The computing device described in this embodiment can be a desktop computer, a laptop, a smartphone, a PDA handheld terminal, a tablet computer, or another terminal device having a display function.
In conclusion the present invention will train input data to input deep learning neural network, pass through deep learning nerve net
The first output data is calculated in network, if the first output data anticipated output data corresponding with training input data are not identical
Discrepancy of quantity close preset condition, then increase the number of plies of hidden layer;If the first output data and training input data are corresponding pre-
The different quantity of phase output data meets preset condition, then test input data is inputted deep learning neural network, passed through
Deep learning neural computing obtains the second output data, if the test corresponding with test input data of the second output data
The different discrepancy of quantity of output data closes preset condition, then reduces the number of plies of hidden layer, by increasing or decreasing the number of plies of hidden layer,
Until just reaching abundant fitting, that is, the correction of fitting is realized, so that top layer concept just has distinguishes different samples enough
The characteristic information of data, thus can be completely corresponding with anticipated output data, legitimate reading data, top when can reach abundant fitting
Layer concept is exactly the concept for being just enough sufficiently to be fitted with output label.
The above are only preferred embodiments of the present invention patent, but the protection scope of the patent is not limited thereto. Any equivalent substitution or change made by a person skilled in the art within the scope disclosed by the present invention patent, according to its technical solution and inventive concept, falls within the protection scope of the patent.
Claims (10)
1. A layer-count increasing and decreasing deep learning neural network training method, characterized in that the method includes:
training the current deep learning neural network with samples; wherein the current deep learning neural network includes an input layer, hidden layers, a classifier, and an output layer;
inputting the training input data into the current deep learning neural network, and obtaining the first output data through the current deep learning neural network computation;
judging whether the first output data is identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition, adding a hidden layer before the classifier in the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, inputting the test input data into the current deep learning neural network, and obtaining the second output data through the deep learning neural network computation;
judging whether the second output data is identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition, deleting the hidden layer immediately preceding the classifier in the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, outputting the current deep learning neural network.
2. The layer-count increasing and decreasing deep learning neural network training method according to claim 1, characterized in that: said adding a hidden layer before the classifier in the current deep learning neural network is specifically: connecting the output of the last hidden layer to the input of the newly inserted hidden layer through an encoder-decoder network, and taking the output of the newly inserted hidden layer as the input of the classifier in the current deep learning neural network.
3. The layer-count increasing and decreasing deep learning neural network training method according to claim 2, characterized in that: the node count of the newly inserted hidden layer is less than or equal to the node count of the last hidden layer.
4. The layer-count increasing and decreasing deep learning neural network training method according to claim 1, characterized in that: said deleting the hidden layer immediately preceding the classifier in the current deep learning neural network is specifically: taking the nodes of the penultimate hidden layer as the input nodes of the classifier in the current deep learning neural network.
5. The layer-count increasing and decreasing deep learning neural network training method according to claim 4, characterized in that: the node count of the penultimate hidden layer is greater than the node count of the last hidden layer.
6. The layer-count increasing and decreasing deep learning neural network training method according to any one of claims 1-5, characterized in that:
the first preset condition includes: the error rate between the first output data and the expected output data corresponding to the training input data is less than or equal to the first preset threshold;
the error rate between the first output data and the expected output data corresponding to the training input data is calculated as: the number of first output data that differ from the expected output data corresponding to the training input data, divided by the total number of training input data tested.
7. The layer-count increasing and decreasing deep learning neural network training method according to any one of claims 1-5, characterized in that:
the second preset condition includes: the error rate between the second output data and the actual result data corresponding to the test input data is less than or equal to the second preset threshold;
the error rate between the second output data and the actual result data corresponding to the test input data is calculated as: the number of second output data that differ from the actual result data corresponding to the test input data, divided by the total number of test input data tested.
8. A layer-count increasing and decreasing deep learning neural network training system, characterized in that the system includes:
a training module for training the current deep learning neural network with samples, wherein the current deep learning neural network includes an input layer, hidden layers, a classifier, and an output layer;
a first input module for inputting the training input data into the current deep learning neural network and obtaining the first output data through the current deep learning neural network computation;
a first judgment module for judging whether the first output data is identical to the expected output data corresponding to the training input data;
a hidden layer increase module for adding a hidden layer before the classifier in the current deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition;
a second input module for inputting the test input data into the current deep learning neural network and obtaining the second output data through the deep learning neural network computation when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition;
a second judgment module for judging whether the second output data is identical to the actual result data corresponding to the test input data;
a hidden layer deletion module for deleting the hidden layer immediately preceding the classifier in the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition;
an output module for outputting the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition.
9. A storage medium storing a program, characterized in that: when the program is executed by a processor, the layer-count increasing and decreasing deep learning neural network training method according to any one of claims 1-7 is realized.
10. A computing device, including a processor and a memory for storing a program executable by the processor, characterized in that: when the processor executes the program stored in the memory, the layer-count increasing and decreasing deep learning neural network training method according to any one of claims 1-7 is realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810823422.7A CN108985456B (en) | 2018-07-25 | 2018-07-25 | Number-of-layers-increasing deep learning neural network training method, system, medium, and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810823422.7A CN108985456B (en) | 2018-07-25 | 2018-07-25 | Number-of-layers-increasing deep learning neural network training method, system, medium, and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108985456A true CN108985456A (en) | 2018-12-11 |
CN108985456B CN108985456B (en) | 2021-06-22 |
Family
ID=64550396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810823422.7A Active CN108985456B (en) | 2018-07-25 | 2018-07-25 | Number-of-layers-increasing deep learning neural network training method, system, medium, and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108985456B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734308A (en) * | 2021-03-10 | 2021-04-30 | 张怡然 | Data collaboration system and method based on neural network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150134583A1 (en) * | 2013-11-14 | 2015-05-14 | Denso Corporation | Learning apparatus, learning program, and learning method |
CN104700153A (en) * | 2014-12-05 | 2015-06-10 | 江南大学 | PH (potential of hydrogen) value predicting method of BP (back propagation) neutral network based on simulated annealing optimization |
US9195935B2 (en) * | 2012-04-30 | 2015-11-24 | The Regents Of The University Of California | Problem solving by plastic neuronal networks |
CN105787557A (en) * | 2016-02-23 | 2016-07-20 | 北京工业大学 | Design method of deep nerve network structure for computer intelligent identification |
CN106709511A (en) * | 2016-12-08 | 2017-05-24 | 华中师范大学 | Urban rail transit panoramic monitoring video fault detection method based on depth learning |
CN108171329A (en) * | 2017-12-13 | 2018-06-15 | 华南师范大学 | Deep learning neural network training method, number of plies adjusting apparatus and robot system |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9195935B2 (en) * | 2012-04-30 | 2015-11-24 | The Regents Of The University Of California | Problem solving by plastic neuronal networks |
US20150134583A1 (en) * | 2013-11-14 | 2015-05-14 | Denso Corporation | Learning apparatus, learning program, and learning method |
CN104700153A (en) * | 2014-12-05 | 2015-06-10 | 江南大学 | PH (potential of hydrogen) value predicting method of BP (back propagation) neutral network based on simulated annealing optimization |
CN105787557A (en) * | 2016-02-23 | 2016-07-20 | 北京工业大学 | Design method of deep nerve network structure for computer intelligent identification |
CN106709511A (en) * | 2016-12-08 | 2017-05-24 | 华中师范大学 | Urban rail transit panoramic monitoring video fault detection method based on depth learning |
CN108171329A (en) * | 2017-12-13 | 2018-06-15 | 华南师范大学 | Deep learning neural network training method, number of plies adjusting apparatus and robot system |
Non-Patent Citations (2)
Title |
---|
余乐安 et al.: "Foreign Exchange Rate and International Crude Oil Price Volatility Forecasting" (《外汇汇率与国际原油价格波动预测》), 30 June 2006 *
夜月XL: "Discussion of the number of hidden layers and hidden-layer nodes in neural networks", 《CSDN: HTTPS://BLOG.CSDN.NET/U013045749/ARTICLE/DETAILS/40783281》 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734308A (en) * | 2021-03-10 | 2021-04-30 | 张怡然 | Data collaboration system and method based on neural network |
CN112734308B (en) * | 2021-03-10 | 2023-06-27 | 张怡然 | Data collaboration system and method based on neural network |
Also Published As
Publication number | Publication date |
---|---|
CN108985456B (en) | 2021-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104679863B (en) | It is a kind of based on deep learning to scheme to search drawing method and system | |
CN108171329A (en) | Deep learning neural network training method, number of plies adjusting apparatus and robot system | |
CN111126218B (en) | Human behavior recognition method based on zero sample learning | |
CN109783666B (en) | Image scene graph generation method based on iterative refinement | |
EP3961441B1 (en) | Identity verification method and apparatus, computer device and storage medium | |
CN108388876A (en) | A kind of image-recognizing method, device and relevant device | |
CN110188343A (en) | Multi-modal emotion identification method based on fusion attention network | |
CN106951825A (en) | A kind of quality of human face image assessment system and implementation method | |
CN107437077A (en) | A kind of method that rotation face based on generation confrontation network represents study | |
CN109447099B (en) | PCA (principal component analysis) dimension reduction-based multi-classifier fusion method | |
CN106068514A (en) | For identifying the system and method for face in free media | |
CN108182409A (en) | Biopsy method, device, equipment and storage medium | |
CN107506786A (en) | A kind of attributive classification recognition methods based on deep learning | |
US10986400B2 (en) | Compact video representation for video event retrieval and recognition | |
CN106778852A (en) | A kind of picture material recognition methods for correcting erroneous judgement | |
CN107871107A (en) | Face authentication method and device | |
US11823490B2 (en) | Non-linear latent to latent model for multi-attribute face editing | |
Zhu et al. | Convolutional ordinal regression forest for image ordinal estimation | |
CN104679967B (en) | A kind of method for judging psychological test reliability | |
CN109271546A (en) | The foundation of image retrieval Feature Selection Model, Database and search method | |
CN113822953A (en) | Processing method of image generator, image generation method and device | |
CN113705596A (en) | Image recognition method and device, computer equipment and storage medium | |
Wang et al. | Learning to augment expressions for few-shot fine-grained facial expression recognition | |
Hu et al. | Multi-perspective cost-sensitive context-aware multi-instance sparse coding and its application to sensitive video recognition | |
CN112101087A (en) | Facial image identity de-identification method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||