CN108985456A - Layer-number increasing and decreasing deep learning neural network training method, system, medium and device - Google Patents

Layer-number increasing and decreasing deep learning neural network training method, system, medium and device

Info

Publication number
CN108985456A
CN108985456A
Authority
CN
China
Prior art keywords
deep learning
neural network
output data
hidden layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810823422.7A
Other languages
Chinese (zh)
Other versions
CN108985456B (en)
Inventor
Zhu Dingju (朱定局)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daguo Innovation Intelligent Technology Dongguan Co ltd
Original Assignee
Daguo Innovation Intelligent Technology Dongguan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daguo Innovation Intelligent Technology Dongguan Co ltd
Priority to CN201810823422.7A
Publication of CN108985456A
Application granted
Publication of CN108985456B
Legal status: Active

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention discloses a layer-number increasing and decreasing deep learning neural network training method, system, medium and device. The method comprises: inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network; judging whether the first output data are identical to the expected output data; if the first preset condition is not met, adding a hidden layer before the classifier of the current deep learning neural network; otherwise, inputting test input data into the current deep learning neural network and computing second output data with it; judging whether the second output data are identical to the actual result data; if the second preset condition is not met, deleting the hidden layer immediately preceding the classifier of the current deep learning neural network; otherwise, outputting the current deep learning neural network. The invention can reach just-sufficient fitting, at which point the top-level concepts are exactly those that are just abstract enough to be fully fitted to the output data.

Description

Layer-number increasing and decreasing deep learning neural network training method, system, medium and device
Technical field
The present invention relates to deep learning neural network training methods, and in particular to a layer-number increasing and decreasing deep learning neural network training method, system, medium and device, belonging to the field of neural network training.
Background technique
Existing deep learning technology can obtain an output label from input data (for example, obtaining a person's ID number from a head portrait, or obtaining it from a voice sample); in the top-down supervised training stage it performs supervised training with labeled data (for example, head portraits labeled with ID numbers, or voice samples labeled with ID numbers).
However, the top-down supervised training of existing deep learning technology either adjusts only the network weights between the output layer and the hidden layers, or adjusts the network weights of all layers. When there are more classes of top-level concepts than classes of labels, and only the classifier weights between the output layer and the hidden layers are adjusted, then with a fairly simple classifier structure the repeated adjustment of the classifier parameters yields results that satisfy one output label but fail another; that is, sufficient fitting can never be achieved. If instead the classifier structure is designed to be extremely complex, for example a BP neural network with complex levels used as the classifier, overfitting appears: the fit discards certain key features, so the classification results are completely correct on the samples but turn out to be wrong in application.
It follows that supervised training of only the level between the output layer and the hidden layers leads either to underfitting or to overfitting, and either one makes deep learning fail in application. If, on the other hand, the network weights of all layers are adjusted, the recognition weights and generation weights in the hidden layers are disturbed, so that after adjustment the resulting concepts and scenes no longer derive entirely from the features and scenes of the input data, but are features and scenes distorted for the needs of the output labels; the same overfitting phenomenon appears, again making the classification results completely correct on the samples but wrong in application.
Summary of the invention
The first object of the present invention is to overcome the above defects of the prior art by providing a layer-number increasing and decreasing deep learning neural network training method. The method makes the top-level concepts carry just enough feature information to distinguish different sample data, so that they can correspond completely to the expected output data and the actual result data; when just-sufficient fitting is reached, the top-level concepts are exactly those that are just abstract enough to be fully fitted to the output data.
The second object of the present invention is to provide a layer-number increasing and decreasing deep learning neural network training system.
The third object of the present invention is to provide a storage medium.
The fourth object of the present invention is to provide a computing device.
The first object of the present invention is achieved by the following technical solution:
A layer-number increasing and decreasing deep learning neural network training method, the method comprising:
training a current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;
inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;
judging whether the first output data are identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet a first preset condition, adding a hidden layer before the classifier of the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, inputting test input data into the current deep learning neural network, and computing second output data with the deep learning neural network;
judging whether the second output data are identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet a second preset condition, deleting the hidden layer immediately preceding the classifier of the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, outputting the current deep learning neural network.
Further, adding a hidden layer before the classifier of the current deep learning neural network specifically comprises: connecting the output of the last hidden layer to the input of the newly inserted hidden layer through an encoder-decoder network, and taking the output of the newly inserted hidden layer as the input of the classifier of the current deep learning neural network.
Further, the number of nodes of the newly inserted hidden layer is less than or equal to the number of nodes of the last hidden layer.
Further, deleting the hidden layer immediately preceding the classifier of the current deep learning neural network specifically comprises: taking the nodes of the second-to-last hidden layer as the input nodes of the classifier of the current deep learning neural network.
Further, the number of nodes of the second-to-last hidden layer is greater than the number of nodes of the last hidden layer.
Further, the first preset condition comprises: the error rate of the first output data against the expected output data corresponding to the training input data is less than or equal to a first preset threshold;
the error rate of the first output data against the expected output data corresponding to the training input data is computed as: the number of first output data that differ from the expected output data corresponding to the training input data, divided by the total number of tests performed on the training input data.
Further, the second preset condition comprises: the error rate of the second output data against the actual result data corresponding to the test input data is less than or equal to a second preset threshold;
the error rate of the second output data against the actual result data corresponding to the test input data is computed as: the number of second output data that differ from the actual result data corresponding to the test input data, divided by the total number of tests performed on the test input data.
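To make the two error-rate definitions concrete, here is a minimal Python sketch (not part of the patent text; the function name and the plain-list interface are illustrative assumptions):

    def error_rate(outputs, references):
        # Error rate as defined above: the number of output data items that
        # differ from the corresponding reference data (expected output data
        # or actual result data), divided by the total number of tests.
        assert len(outputs) == len(references)
        differing = sum(1 for out, ref in zip(outputs, references) if out != ref)
        return differing / len(outputs)

    # First preset condition:  error_rate(first_outputs, expected_outputs) <= threshold_1
    # Second preset condition: error_rate(second_outputs, actual_results) <= threshold_2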
The second object of the present invention is achieved by the following technical solution:
A layer-number increasing and decreasing deep learning neural network training system, the system comprising:
a training module, for training a current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;
a first input module, for inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;
a first judgment module, for judging whether the first output data are identical to the expected output data corresponding to the training input data;
a hidden layer adding module, for adding a hidden layer before the classifier of the current deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition;
a second input module, for inputting test input data into the current deep learning neural network and computing second output data with the deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition;
a second judgment module, for judging whether the second output data are identical to the actual result data corresponding to the test input data;
a hidden layer deleting module, for deleting the hidden layer immediately preceding the classifier of the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition;
an output module, for outputting the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition.
The third object of the present invention is achieved by the following technical solution:
A storage medium storing a program; when the program is executed by a processor, the above layer-number increasing and decreasing deep learning neural network training method is implemented.
The fourth object of the present invention is achieved by the following technical solution:
A computing device comprising a processor and a memory for storing a program executable by the processor; when the processor executes the program stored in the memory, the above layer-number increasing and decreasing deep learning neural network training method is implemented.
Compared with the prior art, the present invention has the following advantageous effects:
1. The present invention inputs the training input data into the deep learning neural network and computes the first output data through the deep learning neural network; if the number of first output data that differ from the expected output data corresponding to the training input data does not meet the preset condition, the number of hidden layers is increased; if that number meets the preset condition, the test input data are input into the deep learning neural network and the second output data are computed through the deep learning neural network; if the number of second output data that differ from the test output data corresponding to the test input data does not meet the preset condition, the number of hidden layers is reduced. By increasing or decreasing the number of hidden layers until just-sufficient fitting is reached, the fit is corrected, so that the top-level concepts carry just enough feature information to distinguish different sample data and therefore correspond completely to the expected output data and the actual result data; when just-sufficient fitting is reached, the top-level concepts are exactly those that are just abstract enough to be fully fitted to the output labels.
2. When increasing the hidden layers of the deep learning neural network, the present invention connects the output of the last hidden layer to the input of the newly inserted hidden layer through an encoder-decoder network and takes the output of the newly inserted hidden layer as the input of the classifier; the node count of the newly inserted hidden layer is less than or equal to that of the last hidden layer. This makes the top-level concepts entering the classifier more abstract, ignoring the features that cannot be mapped to the output labels and abstracting out the features that can fully correspond to the output data.
3. When reducing the hidden layers of the deep learning neural network, the present invention takes the nodes of the second-to-last hidden layer as the input nodes of the classifier of the current deep learning neural network; the node count of the second-to-last hidden layer is generally greater than that of the last hidden layer. This makes the top-level concepts entering the classifier more specific, restoring the previously ignored features that can fully correspond to the output data.
Detailed description of the invention
Fig. 1 is a flowchart of the layer-number increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.
Fig. 2 is a model diagram of the deep learning neural network in the layer-number increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.
Fig. 3 is a schematic diagram of inserting a new hidden layer into the deep learning neural network in the layer-number increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.
Fig. 4 is a schematic diagram of deleting a hidden layer from the deep learning neural network in the layer-number increasing and decreasing deep learning neural network training method of Embodiment 1 of the present invention.
Fig. 5 is a structural block diagram of the layer-number increasing and decreasing deep learning neural network training system of Embodiment 2 of the present invention.
Specific embodiment
The present invention is described in further detail below with reference to the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.
Embodiment 1:
Deep learning is explained, and its networks are built, as follows:
The computation involved in producing an output from an input can be represented by a flow graph: a flow graph is a graph that represents a computation, in which each node represents a basic computation and the value of that computation, and the result of the computation is applied to the values of the node's children. Consider a set of computations allowed at each node together with the possible graph structures; this defines a family of functions. Input nodes have no parents, and output nodes have no children.
One special attribute of such a flow graph is its depth: the length of the longest path from an input to an output.
Viewing the learned structure as a network, the core ideas of deep learning are as follows:
Step 1: bottom-up unsupervised training
1) First build the network one neuron layer at a time, layer by layer.
2) Tune each layer with the wake-sleep algorithm, adjusting only one layer at a time, layer by layer.
This process can be regarded as a feature-learning process, and it is the part that differs most from traditional neural networks. Wake-sleep algorithm: 1) Wake phase: the recognition process generates an abstract representation (Code) of each layer from the lower layer's input features (Input) through the upward recognition (Encoder) weights, then produces a reconstruction (Reconstruction) through the current generation (Decoder) weights; the residual between the input features and the reconstruction is computed, and gradient descent is used to modify the downward generation (Decoder) weights between the layers. In other words: "if reality differs from what I imagined, change my generation weights so that what I imagine becomes reality".
2) Sleep phase:
The generation process produces the state of the lower layer from the upper-layer concept (Code) through the downward generation (Decoder) weights, and then uses the recognition (Encoder) weights to produce an abstract scene. Using the residual between the original upper-layer concept and the newly built abstract scene, gradient descent is used to modify the upward recognition (Encoder) weights between the layers. In other words: "if a scene in a dream is not a corresponding concept in my mind, change my recognition weights so that this scene becomes, in my view, exactly that concept".
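As an illustration of the two phases, the following is a minimal per-layer sketch in Python/PyTorch (the layer sizes, sigmoid activations and mean-squared residuals are assumptions chosen for illustration; the patent does not prescribe this code):

    import torch
    import torch.nn as nn

    # One layer pair: upward recognition (Encoder) and downward generation (Decoder).
    encoder = nn.Linear(784, 256)   # input features (Input) -> abstract representation (Code)
    decoder = nn.Linear(256, 784)   # Code -> reconstruction (Reconstruction) of the input
    enc_opt = torch.optim.SGD(encoder.parameters(), lr=0.01)
    dec_opt = torch.optim.SGD(decoder.parameters(), lr=0.01)

    def wake_step(x):
        # Wake phase: recognize upward, reconstruct downward, and adjust
        # only the generation (Decoder) weights by gradient descent.
        code = torch.sigmoid(encoder(x)).detach()    # recognition result held fixed
        recon = torch.sigmoid(decoder(code))
        loss = ((x - recon) ** 2).mean()             # input vs. reconstruction residual
        dec_opt.zero_grad(); loss.backward(); dec_opt.step()

    def sleep_step(code):
        # Sleep phase: generate a lower-layer state downward, recognize it upward,
        # and adjust only the recognition (Encoder) weights by gradient descent.
        fantasy = torch.sigmoid(decoder(code)).detach()  # generated lower-layer state
        recog = torch.sigmoid(encoder(fantasy))
        loss = ((code - recog) ** 2).mean()              # concept vs. abstract-scene residual
        enc_opt.zero_grad(); loss.backward(); enc_opt.step()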
Step 2: top-down supervised training
On the basis of the per-layer parameters learned in the first step, this step adds a classifier (for example logistic regression, an SVM, etc.) on the topmost coding layer, and then fine-tunes the parameters of the whole network by gradient descent through supervised training with labeled data.
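Continuing the sketch above, the supervised stage can be written as follows (again an illustrative assumption; the classifier dimensions are arbitrary and `encoder` is the pretrained layer from the previous sketch):

    import torch.nn.functional as F

    # Stack the pretrained coding layer(s), add a classifier on the topmost
    # coding layer, and fine-tune the whole network with labeled data.
    network = nn.Sequential(
        encoder,             # pretrained in step 1
        nn.Sigmoid(),
        nn.Linear(256, 4),   # classifier on the topmost coding layer
    )
    optimizer = torch.optim.SGD(network.parameters(), lr=0.01)

    def supervised_step(x, labels):
        logits = network(x)
        loss = F.cross_entropy(logits, labels)  # supervised training with labeled data
        optimizer.zero_grad(); loss.backward(); optimizer.step()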
The first step of deep learning is in essence a network-parameter initialization process. Unlike the random initial values of traditional neural networks, a deep learning neural network obtains its initial values from the structure of the input data through unsupervised training, so the initial values are closer to the global optimum and better results can be obtained.
During the supervised training of the deep learning neural network, if there are too many layers, the top-level concepts become over-abstract and lack distinguishing detail, and during supervised training "noise" details that happen to separate the output labels are added, so that the top-level concepts overfit the labels: non-distinguishing noise features are fitted into the top-level concepts during training, which necessarily raises the error rate in subsequent testing. For example, suppose the labels include "white man", "white woman", "black man" and "black woman", while the top-level concepts have abstracted only the "man" and "woman" features and ignored the "black" and "white" features. All sample data can then be fitted onto only the two labels "man" and "woman". Because deep learning uses bottom-up unsupervised training followed by top-down supervised training, in the bottom-up unsupervised stage the "white man" and "black man" sample classes clearly correspond to the "man" top-level concept and the "white woman" and "black woman" sample classes to the "woman" top-level concept; likewise, in the top-down supervised stage the "white man" and "black man" labels correspond to the "man" top-level concept and the "white woman" and "black woman" labels to the "woman" top-level concept. Through supervised training, the deep learning neural network automatically adjusts its weights until the "man + noise 1" top-level concept corresponds to "black man", "man + noise 2" to "white man", "woman + noise 3" to "black woman" and "woman + noise 4" to "white woman". Because fitting is repeated during supervised training, the effect of sufficient fitting is achieved for the training data. But in use, when a "white man" test sample is input, the top-level concept obtained is "man + noise 2"; since noise 2 is not a feature that distinguishes white men from black men, the label obtained through the classifier may be "white man" or "black man", or, because of the interference of noise 2, even "white woman" or "black woman", so the error rate at test time increases.
If there are too few layers, the top-level concepts become overly specific and carry details that are irrelevant or even contradictory to distinguishing the output labels, so the top-level concepts cannot be fitted to the labels; that is, top-level concepts and labels cannot be put into one-to-one correspondence. For example, suppose the labels include "man" and "woman", and the features of the top-level concepts include hair features and skin-color features in addition to the essential features that distinguish men from women. Because deep learning uses bottom-up unsupervised training followed by top-down supervised training, if the top-down supervised stage starts with many training samples of "short-haired man", "black man", "white woman" and "long-haired woman", the top-level concepts formed will be "short-haired man", "black man", "white woman" and "long-haired woman", where "short-haired man" and "black man" map through the classifier to the "man" output label, and "white woman" and "long-haired woman" to the "woman" output label. But if many training samples of "long-haired man" and "black woman" arrive later, the top-level concepts will be adjusted to "long-haired man", "black man", "black woman" and "long-haired woman", after which the "short-haired man" and "white woman" samples obviously can no longer be fitted. The deep learning neural network thus keeps adjusting its weights as the samples vary but can never fit them all. This is because the number of layers is insufficient and the degree of abstraction is too low: the non-distinguishing details cannot be abstracted away, so the number of network layers needs to be increased.
Therefore, this embodiment provides a layer-number increasing and decreasing deep learning neural network training method. The method makes the top-level concepts carry just enough feature information to distinguish different sample data, so that they correspond completely to the expected output data and the actual result data; when just-sufficient fitting is reached, the top-level concepts are exactly those that are just abstract enough to be fully fitted to the output data.
As shown in Fig. 1, the layer-number increasing and decreasing deep learning neural network training method of this embodiment comprises the following steps:
S101: train the current deep learning neural network with samples.
The current deep learning neural network is obtained as follows: a deep learning neural network is initialized to obtain a preset deep learning neural network, which serves as the current deep learning neural network. As shown in Fig. 2, the deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer; the classifier lies between the hidden layers and the output layer, the last hidden layer is the input of the classifier, and the output layer is the output of the classifier.
Train the current deep learning neural network with samples; for example, train with face images and name labels to obtain the current deep learning neural network.
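A minimal PyTorch sketch of such a network follows (the class name, dimensions and sigmoid activations are illustrative assumptions):

    import torch
    import torch.nn as nn

    class LayerAdjustableNet(nn.Module):
        # Input layer -> hidden layers -> classifier -> output layer, with the
        # last hidden layer feeding the classifier, as in Fig. 2.
        def __init__(self, in_dim=1024, hidden_dims=(512, 256), n_labels=100):
            super().__init__()
            self.hidden = nn.ModuleList()
            prev = in_dim
            for h in hidden_dims:
                self.hidden.append(nn.Linear(prev, h))
                prev = h
            self.classifier = nn.Linear(prev, n_labels)  # between hidden layers and output layer

        def forward(self, x):
            for layer in self.hidden:
                x = torch.sigmoid(layer(x))
            return self.classifier(x)  # the output layer is the classifier output

    net = LayerAdjustableNet()  # the "current deep learning neural network"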
S102: input the training input data into the current deep learning neural network and compute the first output data with the current deep learning neural network.
The training input data may be face images. Specifically, the training input data may be obtained by acquisition, for example face images captured by a camera; the training input data may also be obtained by database lookup, for example face images stored in a database in advance and retrieved from it.
The first output data may be name labels. Specifically, each face training image is input into the current deep learning neural network, and the name label is output through the computation of the deep learning neural network.
S103: judge whether the first output data are identical to the expected output data corresponding to the training input data.
Specifically, judge whether the output name label is identical to the expected name label corresponding to the face training image, and record the total number of tests performed on the face training images each time a name label is output; that is, when a name label is output for the first time the total is recorded as 1, the second time as 2, and so on, so that when the M-th name label is output the total is M. Count how many of the M output name labels differ from the expected name labels corresponding to the face training images, and denote this number A. The error rate of the output name labels against the expected name labels corresponding to the face training images, called the first error rate, is that number divided by the total number of tests performed on the face training images, i.e. A/M. If the first error rate is less than or equal to the first preset threshold, the number of name labels that differ from the expected name labels corresponding to the face training images meets the first preset condition, and the method proceeds to step S105; otherwise, i.e. if the first error rate is greater than the first preset threshold, that number does not meet the first preset condition, and the method proceeds to step S104.
Suppose the first preset threshold is 30%. If the first error rate is, say, 90%, it is greater than the first preset threshold; the network cannot fit sufficiently, i.e. the degree of fitting is insufficient, the current deep learning neural network cannot map the training data effectively onto the output labels, and the number of hidden layers is too small to contain enough mapping relations, so the number of hidden layers must be increased. If the first error rate is, say, 20%, it is less than or equal to the first preset threshold; the degree of fitting is sufficient, and the next step is to judge whether the network is overfitted.
S104: add a hidden layer before the classifier of the current deep learning neural network.
If the deep learning neural network is not sufficiently fitted during sample learning, then even tests on the training sample data will not reliably yield the correct expected output labels, and deep learning fails. In deep learning, insufficient fitting indicates that unnecessary characteristic details may have been added to the sample features during recognition, and such details often interfere with classification; a hidden layer therefore needs to be added until just-sufficient fitting is reached.
This step increases the degree of fitting by increasing the number of hidden layers. Specifically, the output of the last hidden layer is connected to the input of the newly inserted hidden layer through an encoder-decoder network, and the output of the newly inserted hidden layer becomes the input of the classifier. Fig. 3 is a schematic diagram of inserting the new hidden layer; the circled layer in the figure is the newly inserted hidden layer. After inserting the new hidden layer, return to step S101 and continue.
Preferably, the node count of the newly inserted hidden layer may be less than or equal to the original node count of the last hidden layer, rather than greater than it, because this makes the top-level concepts entering the classifier more abstract, ignoring the features that cannot be mapped to the output labels and abstracting out the features that can fully correspond to the output labels.
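Building on the `LayerAdjustableNet` sketch above, the structural part of this step might look as follows (an illustrative assumption: the patent trains the new connection as an encoder-decoder network, which this sketch leaves to the subsequent S101 pass and stands in for with random initialization):

    def add_hidden_layer(net):
        # Connect the output of the last hidden layer to the input of the newly
        # inserted hidden layer; the new layer's output becomes the classifier input.
        last_dim = net.hidden[-1].out_features
        new_dim = last_dim  # node count <= that of the last hidden layer
        net.hidden.append(nn.Linear(last_dim, new_dim))
        # Re-wire the classifier to take the new hidden layer's output as its input.
        net.classifier = nn.Linear(new_dim, net.classifier.out_features)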
S105: input the test input data into the current deep learning neural network and compute the second output data with the current deep learning neural network.
The test input data may also be face images, and the second output data may be name labels. Specifically, each face test image is input into the current deep learning neural network, and the name label is output through the computation of the deep learning neural network.
S106: judge whether the second output data are identical to the actual result data corresponding to the test input data.
Specifically, judge whether the output name label is identical to the actual name label corresponding to the face test image, and record the total number of tests performed on the face test images each time a name label is output; that is, when a name label is output for the first time the total is recorded as 1, the second time as 2, and so on, so that when the N-th name label is output the total is N. Count how many of the N output name labels differ from the actual name labels corresponding to the face test images, and denote this number B. The error rate of the output name labels against the actual name labels corresponding to the face test images, called the second error rate, is that number divided by the total number of tests performed on the face test images, i.e. B/N. If the second error rate is less than or equal to the second preset threshold, the number of name labels that differ from the actual name labels corresponding to the face test images meets the second preset condition, and the method proceeds to step S108; otherwise, i.e. if the second error rate is greater than the second preset threshold, that number does not meet the second preset condition, and the method proceeds to step S107.
Suppose the second preset threshold is 20%. If the second error rate is, say, 80%, it is greater than the second preset threshold; the network is overfitted, meaning that necessary characteristic details of the test input data may have been ignored during recognition, and those details are exactly what classification cannot do without, so the number of hidden layers must be reduced until just-sufficient fitting is reached. If the second error rate is, say, 10%, it is less than or equal to the second preset threshold, meaning just-sufficient fitting has been reached. In addition, the second error rate is generally smaller than the first error rate, because even in the case of just-sufficient fitting the accuracy on the training data is always lower than the accuracy measured with the test data.
S107: delete the hidden layer immediately preceding the classifier of the current deep learning neural network.
If the deep learning neural network is overfitted during sample learning, the correct expected output labels are always obtained after inputting the sample data used for training, but many correct output labels are not obtained after inputting the test data, and deep learning fails. In deep learning, overfitting indicates that necessary characteristic details of the sample features may have been ignored during recognition, and those details are exactly what classification cannot do without; the layers therefore need to be removed one by one until just-sufficient fitting is reached.
This step reduces the degree of fitting by reducing the number of hidden layers. Specifically, the hidden layer immediately preceding the classifier of the current deep learning neural network is deleted: the nodes of the second-to-last hidden layer become the input nodes of the classifier of the current deep learning neural network. Fig. 4 is a schematic diagram of deleting a hidden layer; the circled layer in the figure is the hidden layer to be deleted. After deleting the hidden layer, return to step S101 and continue.
Preferably, the node count of the second-to-last hidden layer is generally greater than that of the last hidden layer, which makes the top-level concepts entering the classifier more specific, restoring the previously ignored features that can fully correspond to the output labels.
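The structural counterpart of this step, under the same assumptions as the earlier sketches, might be:

    def delete_last_hidden_layer(net):
        # Delete the hidden layer immediately preceding the classifier, so that
        # the nodes of the second-to-last hidden layer become the classifier input.
        assert len(net.hidden) >= 2, "keep at least one hidden layer"
        del net.hidden[-1]
        penult_dim = net.hidden[-1].out_features
        net.classifier = nn.Linear(penult_dim, net.classifier.out_features)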
S108: output the current deep learning neural network.
When the second error rate is less than or equal to the second preset threshold, the supervised training ends, and the current deep learning neural network is output as a deep learning neural network that is just sufficiently fitted; a just sufficiently fitted deep learning neural network is one that is neither insufficiently fitted nor overfitted, but fitted just right. At this point, if test input data are input into the current deep learning neural network, the top-level concepts obtained through unsupervised learning reflect, to the greatest possible extent, the features of the test input data that correspond to the expected output labels; feeding these features into the classifier naturally yields the expected label output with maximum probability.
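Putting steps S101 through S108 together, the whole procedure can be sketched as below (illustrative assumptions: `fit` and `predict` are stand-in helpers for the training and inference passes, the data containers are hypothetical, and `error_rate`, `add_hidden_layer` and `delete_last_hidden_layer` are the sketches given earlier):

    def train_with_layer_adjustment(net, train_data, test_data,
                                    threshold_1=0.30, threshold_2=0.20):
        # Grow the hidden layers until the first error rate meets the first
        # preset condition, then shrink them until the second error rate meets
        # the second preset condition, i.e. until just-sufficient fitting.
        while True:
            fit(net, train_data)                                  # S101
            first_err = error_rate(predict(net, train_data.inputs),
                                   train_data.expected)           # S102-S103
            if first_err > threshold_1:
                add_hidden_layer(net)                             # S104: underfitted
                continue
            second_err = error_rate(predict(net, test_data.inputs),
                                    test_data.actual)             # S105-S106
            if second_err > threshold_2:
                delete_last_hidden_layer(net)                     # S107: overfitted
                continue
            return net                                            # S108: output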
Those of ordinary skill in the art will appreciate that all or part of the steps of the method of the above embodiment can be completed by a program instructing the relevant hardware, and the corresponding program can be stored in a computer-readable storage medium such as a ROM/RAM, magnetic disk or optical disc.
Embodiment 2:
As shown in Fig. 5, this embodiment provides a layer-number increasing and decreasing deep learning neural network training system. The system comprises a training module 501, a first input module 502, a first judgment module 503, a hidden layer adding module 504, a second input module 505, a second judgment module 506, a hidden layer deleting module 507 and an output module 508. The specific functions of the modules are as follows:
The training module 501 is used for training the current deep learning neural network with samples; the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer.
The first input module 502 is used for inputting the training input data into the current deep learning neural network and computing the first output data with the current deep learning neural network.
The first judgment module 503 is used for judging whether the first output data are identical to the expected output data corresponding to the training input data.
The hidden layer adding module 504 is used for adding a hidden layer before the classifier of the current deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition, specifically: connecting the output of the last hidden layer to the input of the newly inserted hidden layer through an encoder-decoder network, and taking the output of the newly inserted hidden layer as the input of the classifier of the current deep learning neural network. The first preset condition comprises: the error rate of the first output data against the expected output data corresponding to the training input data is less than or equal to the first preset threshold.
The second input module 505 is used for inputting the test input data into the current deep learning neural network and computing the second output data through the deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition.
The second judgment module 506 is used for judging whether the second output data are identical to the actual result data corresponding to the test input data.
The hidden layer deleting module 507 is used for deleting the hidden layer immediately preceding the classifier of the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition, specifically: taking the nodes of the second-to-last hidden layer as the input nodes of the classifier of the current deep learning neural network. The second preset condition comprises: the error rate of the second output data against the actual result data corresponding to the test input data is less than or equal to the second preset threshold.
The output module 508 is used for outputting the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition.
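As an illustration of this module decomposition (the class and method names are assumptions, and `fit`, `predict` and the earlier sketches are reused):

    class LayerAdjustTrainingSystem:
        # Sketch of the module decomposition of Fig. 5; each method stands in
        # for one or more of the numbered modules 501-508.
        def __init__(self, net, threshold_1=0.30, threshold_2=0.20):
            self.net, self.t1, self.t2 = net, threshold_1, threshold_2

        def train(self, samples):                      # training module 501
            fit(self.net, samples)

        def first_judgment(self, train_data):          # modules 502-503
            return error_rate(predict(self.net, train_data.inputs),
                              train_data.expected) <= self.t1

        def step(self, train_data, test_data):         # modules 504-508
            if not self.first_judgment(train_data):
                add_hidden_layer(self.net)             # hidden layer adding module 504
                return None
            if error_rate(predict(self.net, test_data.inputs),
                          test_data.actual) > self.t2: # modules 505-506
                delete_last_hidden_layer(self.net)     # hidden layer deleting module 507
                return None
            return self.net                            # output module 508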
It should be noted that the system provided by the above embodiment is illustrated only in terms of the division into the above functional modules; in practical applications, the above functions may be allocated to different functional modules as needed, i.e. the internal structure may be divided into different functional modules to complete all or part of the functions described above.
It should be understood that the terms "first", "second", etc. used in the system of the above embodiment may be used to describe various units, but the units should not be limited by these terms; the terms are only used to distinguish one unit from another. For example, without departing from the scope of the present invention, the first judgment module could be called the second judgment module and, similarly, the second judgment module could be called the first judgment module; both are judgment modules, but they are not the same judgment module.
Embodiment 3:
This embodiment provides a storage medium storing one or more programs; when the programs are executed by a processor, the layer-number increasing and decreasing deep learning neural network training method of Embodiment 1 above is implemented, as follows:
training a current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;
inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;
judging whether the first output data are identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition, adding a hidden layer before the classifier of the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, inputting test input data into the current deep learning neural network, and computing second output data with the deep learning neural network;
judging whether the second output data are identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition, deleting the hidden layer immediately preceding the classifier of the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, outputting the current deep learning neural network.
The storage medium described in this embodiment may be a ROM, a RAM, a magnetic disk, an optical disc or another such medium.
Embodiment 4:
This embodiment provides a computing device comprising a processor and a memory storing one or more programs; when the processor executes a program stored in the memory, the layer-number increasing and decreasing deep learning neural network training method of Embodiment 1 above is implemented, as follows:
training a current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;
inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;
judging whether the first output data are identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition, adding a hidden layer before the classifier of the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, inputting test input data into the current deep learning neural network, and computing second output data with the deep learning neural network;
judging whether the second output data are identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition, deleting the hidden layer immediately preceding the classifier of the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, outputting the current deep learning neural network.
The computing device described in this embodiment may be a desktop computer, a laptop computer, a smartphone, a PDA handheld terminal, a tablet computer or another terminal device with a display function.
In summary, the present invention inputs the training input data into the deep learning neural network and computes the first output data through the deep learning neural network; if the number of first output data that differ from the expected output data corresponding to the training input data does not meet the preset condition, the number of hidden layers is increased; if that number meets the preset condition, the test input data are input into the deep learning neural network and the second output data are computed through the deep learning neural network; if the number of second output data that differ from the test output data corresponding to the test input data does not meet the preset condition, the number of hidden layers is reduced. By increasing or decreasing the number of hidden layers until just-sufficient fitting is reached, the fit is corrected, so that the top-level concepts carry just enough feature information to distinguish different sample data and therefore correspond completely to the expected output data and the actual result data; when just-sufficient fitting is reached, the top-level concepts are exactly those that are just abstract enough to be fully fitted to the output labels.
The above are only preferred embodiments of the patent of the present invention, but the protection scope of this patent is not limited thereto. Any equivalent substitution or change that a person skilled in the art makes, within the scope disclosed by this patent, according to the technical solution and inventive concept of this patent, falls within the protection scope of this patent.

Claims (10)

1. A layer-number increasing and decreasing deep learning neural network training method, characterized in that the method comprises:
training a current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;
inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;
judging whether the first output data are identical to the expected output data corresponding to the training input data;
when the number of first output data that differ from the expected output data corresponding to the training input data does not meet a first preset condition, adding a hidden layer before the classifier of the current deep learning neural network;
when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition, inputting test input data into the current deep learning neural network, and computing second output data with the deep learning neural network;
judging whether the second output data are identical to the actual result data corresponding to the test input data;
when the number of second output data that differ from the actual result data corresponding to the test input data does not meet a second preset condition, deleting the hidden layer immediately preceding the classifier of the current deep learning neural network;
when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition, outputting the current deep learning neural network.
2. The layer-number increasing and decreasing deep learning neural network training method according to claim 1, characterized in that adding a hidden layer before the classifier of the current deep learning neural network specifically comprises: connecting the output of the last hidden layer to the input of the newly inserted hidden layer through an encoder-decoder network, and taking the output of the newly inserted hidden layer as the input of the classifier of the current deep learning neural network.
3. The layer-number increasing and decreasing deep learning neural network training method according to claim 2, characterized in that the number of nodes of the newly inserted hidden layer is less than or equal to the number of nodes of the last hidden layer.
4. The layer-number increasing and decreasing deep learning neural network training method according to claim 1, characterized in that deleting the hidden layer immediately preceding the classifier of the current deep learning neural network specifically comprises: taking the nodes of the second-to-last hidden layer as the input nodes of the classifier of the current deep learning neural network.
5. The layer-number increasing and decreasing deep learning neural network training method according to claim 4, characterized in that the number of nodes of the second-to-last hidden layer is greater than the number of nodes of the last hidden layer.
6. The layer-number increasing and decreasing deep learning neural network training method according to any one of claims 1-5, characterized in that the first preset condition comprises: the error rate of the first output data against the expected output data corresponding to the training input data is less than or equal to a first preset threshold;
the error rate of the first output data against the expected output data corresponding to the training input data is computed as: the number of first output data that differ from the expected output data corresponding to the training input data, divided by the total number of tests performed on the training input data.
7. The layer-number increasing and decreasing deep learning neural network training method according to any one of claims 1-5, characterized in that the second preset condition comprises: the error rate of the second output data against the actual result data corresponding to the test input data is less than or equal to a second preset threshold;
the error rate of the second output data against the actual result data corresponding to the test input data is computed as: the number of second output data that differ from the actual result data corresponding to the test input data, divided by the total number of tests performed on the test input data.
8. A layer-number increasing and decreasing deep learning neural network training system, characterized in that the system comprises:
a training module, for training a current deep learning neural network with samples, wherein the current deep learning neural network comprises an input layer, hidden layers, a classifier and an output layer;
a first input module, for inputting training input data into the current deep learning neural network, and computing first output data with the current deep learning neural network;
a first judgment module, for judging whether the first output data are identical to the expected output data corresponding to the training input data;
a hidden layer adding module, for adding a hidden layer before the classifier of the current deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data does not meet the first preset condition;
a second input module, for inputting test input data into the current deep learning neural network and computing second output data through the deep learning neural network when the number of first output data that differ from the expected output data corresponding to the training input data meets the first preset condition;
a second judgment module, for judging whether the second output data are identical to the actual result data corresponding to the test input data;
a hidden layer deleting module, for deleting the hidden layer immediately preceding the classifier of the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data does not meet the second preset condition;
an output module, for outputting the current deep learning neural network when the number of second output data that differ from the actual result data corresponding to the test input data meets the second preset condition.
9. A storage medium storing a program, characterized in that when the program is executed by a processor, the layer-number increasing and decreasing deep learning neural network training method according to any one of claims 1-7 is implemented.
10. A computing device, comprising a processor and a memory for storing a program executable by the processor, characterized in that when the processor executes the program stored in the memory, the layer-number increasing and decreasing deep learning neural network training method according to any one of claims 1-7 is implemented.
CN201810823422.7A 2018-07-25 2018-07-25 Number-of-layers-increasing deep learning neural network training method, system, medium, and device Active CN108985456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810823422.7A CN108985456B (en) 2018-07-25 2018-07-25 Number-of-layers-increasing deep learning neural network training method, system, medium, and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810823422.7A CN108985456B (en) 2018-07-25 2018-07-25 Number-of-layers-increasing deep learning neural network training method, system, medium, and device

Publications (2)

Publication Number Publication Date
CN108985456A true CN108985456A (en) 2018-12-11
CN108985456B CN108985456B (en) 2021-06-22

Family

ID=64550396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810823422.7A Active CN108985456B (en) 2018-07-25 2018-07-25 Number-of-layers-increasing deep learning neural network training method, system, medium, and device

Country Status (1)

Country Link
CN (1) CN108985456B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734308A (en) * 2021-03-10 2021-04-30 张怡然 Data collaboration system and method based on neural network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150134583A1 (en) * 2013-11-14 2015-05-14 Denso Corporation Learning apparatus, learning program, and learning method
CN104700153A (en) * 2014-12-05 2015-06-10 江南大学 pH value prediction method of BP (back propagation) neural network based on simulated annealing optimization
US9195935B2 (en) * 2012-04-30 2015-11-24 The Regents Of The University Of California Problem solving by plastic neuronal networks
CN105787557A (en) * 2016-02-23 2016-07-20 北京工业大学 Design method of deep nerve network structure for computer intelligent identification
CN106709511A (en) * 2016-12-08 2017-05-24 华中师范大学 Urban rail transit panoramic monitoring video fault detection method based on depth learning
CN108171329A (en) * 2017-12-13 2018-06-15 华南师范大学 Deep learning neural network training method, layer-number adjusting apparatus and robot system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9195935B2 (en) * 2012-04-30 2015-11-24 The Regents Of The University Of California Problem solving by plastic neuronal networks
US20150134583A1 (en) * 2013-11-14 2015-05-14 Denso Corporation Learning apparatus, learning program, and learning method
CN104700153A (en) * 2014-12-05 2015-06-10 江南大学 pH value prediction method of BP (back propagation) neural network based on simulated annealing optimization
CN105787557A (en) * 2016-02-23 2016-07-20 北京工业大学 Design method of deep nerve network structure for computer intelligent identification
CN106709511A (en) * 2016-12-08 2017-05-24 华中师范大学 Urban rail transit panoramic monitoring video fault detection method based on depth learning
CN108171329A (en) * 2017-12-13 2018-06-15 华南师范大学 Deep learning neural network training method, layer-number adjusting apparatus and robot system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
余乐安 et al.: "Forecasting Foreign Exchange Rates and International Crude Oil Price Fluctuations" (《外汇汇率与国际原油价格波动预测》), 30 June 2006 *
夜月XL: "A discussion of the number of hidden layers and hidden-layer nodes in neural networks" (《神经网络中隐层数和隐层节点数问题的讨论》), CSDN: https://blog.csdn.net/u013045749/article/details/40783281 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734308A (en) * 2021-03-10 2021-04-30 张怡然 Data collaboration system and method based on neural network
CN112734308B (en) * 2021-03-10 2023-06-27 张怡然 Data collaboration system and method based on neural network

Also Published As

Publication number Publication date
CN108985456B (en) 2021-06-22

Similar Documents

Publication Publication Date Title
CN104679863B Search-by-image method and system based on deep learning
CN108171329A Deep learning neural network training method, layer-number adjusting apparatus and robot system
CN111126218B Human behavior recognition method based on zero-shot learning
CN109783666B Image scene graph generation method based on iterative refinement
EP3961441B1 Identity verification method and apparatus, computer device and storage medium
CN108388876A Image recognition method, apparatus and related device
CN110188343A Multi-modal emotion recognition method based on fusion attention network
CN106951825A Face image quality assessment system and implementation method
CN107437077A Rotated-face representation learning method based on generative adversarial networks
CN109447099B Multi-classifier fusion method based on PCA dimensionality reduction
CN106068514A System and method for recognizing faces in free media
CN108182409A Liveness detection method, apparatus, device and storage medium
CN107506786A Attribute classification recognition method based on deep learning
US10986400B2 Compact video representation for video event retrieval and recognition
CN106778852A Image content recognition method that corrects misjudgments
CN107871107A Face authentication method and device
US11823490B2 Non-linear latent to latent model for multi-attribute face editing
Zhu et al. Convolutional ordinal regression forest for image ordinal estimation
CN104679967B Method for judging the reliability of psychological tests
CN109271546A Establishment of an image retrieval feature selection model, database and search method
CN113822953A Image generator processing method, image generation method and apparatus
CN113705596A Image recognition method and device, computer equipment and storage medium
Wang et al. Learning to augment expressions for few-shot fine-grained facial expression recognition
Hu et al. Multi-perspective cost-sensitive context-aware multi-instance sparse coding and its application to sensitive video recognition
CN112101087A Facial image identity de-identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant