CN107480773A - Method, apparatus and storage medium for training a convolutional neural network model - Google Patents

Method, apparatus and storage medium for training a convolutional neural network model

Info

Publication number
CN107480773A
Authority
CN
China
Prior art keywords
hidden layer
hidden
layer
probability
hiding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710675297.5A
Other languages
Chinese (zh)
Other versions
CN107480773B (en)
Inventor
万韶华 (Wan Shaohua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201710675297.5A
Publication of CN107480773A
Application granted
Publication of CN107480773B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a method, apparatus and storage medium for training a convolutional neural network model, and belongs to the field of deep learning. The method includes: for each hidden layer of the multiple hidden layers included in a convolutional neural network model, selecting target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another; and training the convolutional neural network model based on the target nodes selected from the multiple hidden layers. Because different hidden layers correspond to different input values, selecting target nodes from different hidden layers with different hiding probabilities for training, rather than training all hidden layers with the same hiding probability as in the related art, can effectively improve the image recognition accuracy of the convolutional neural network model.

Description

Method, apparatus and storage medium for training a convolutional neural network model
Technical field
The present disclosure relates to the field of deep learning, and in particular to a method, apparatus and storage medium for training a convolutional neural network model.
Background
In recent years, deep learning has been widely applied in the field of image recognition and classification. The convolutional neural network models used in deep learning are typically multi-layer convolutional networks. When training such a convolutional neural network model, a small number of samples in the training set easily causes over-fitting, which in turn reduces image recognition accuracy. To solve this problem, the convolutional neural network model can be trained using the Dropout algorithm.
In the related art, a convolutional neural network model can include an input layer, an output layer and multiple hidden layers located between the input layer and the output layer. The input layer is connected to the first hidden layer, the output value of each hidden layer serves as the input value of the adjacent next hidden layer, and the output value of the last hidden layer serves as the input value of the output layer. When training the convolutional neural network model with the Dropout algorithm, for each hidden layer of the model, target nodes can be selected from the multiple nodes included in the hidden layer according to a predetermined probability, and the model is then trained according to the target nodes selected from the multiple hidden layers.
Summary
To overcome the problem in the related art that the image recognition accuracy of a convolutional neural network model is relatively low when the model is trained with the same predetermined probability for all hidden layers, the present disclosure provides a method, apparatus and storage medium for training a convolutional neural network model.
According to a first aspect of the embodiments of the present disclosure, there is provided a method for training a convolutional neural network model, including:
for each hidden layer of the multiple hidden layers included in the convolutional neural network model, selecting target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another; and
training the convolutional neural network model based on the target nodes selected from the multiple hidden layers.
Optionally, before the target nodes are selected from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, the method further includes:
determining the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers increase successively in descending order of the abstraction level of the output values of the multiple hidden layers.
Optionally, determining the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model includes:
for each hidden layer of the multiple hidden layers included in the convolutional neural network model, obtaining the output value of the hidden layer; and
determining the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, determining the hiding probability of the hidden layer based on the output value of the hidden layer includes:
performing singular value decomposition on the output value of the hidden layer to obtain N singular values, where N is a positive integer greater than 1;
calculating the sum of squares of the N singular values, and calculating the product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares;
sorting the N singular values in descending order to obtain a sorting result;
determining the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1; and
determining the ratio of M to N as the hiding probability of the hidden layer.
Optionally, determining the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model includes:
obtaining the output value of a first hidden layer and the output value of a second hidden layer, where the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers;
determining the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determining the hiding probability of the second hidden layer based on the output value of the second hidden layer; and
determining the hiding probabilities of the other hidden layers among the multiple hidden layers, excluding the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer; and
determining the hiding probabilities of the other hidden layers among the multiple hidden layers, excluding the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer includes:
determining the hiding probability of each hidden layer located between the first hidden layer and the last hidden layer based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
Optionally, selecting target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer includes:
for each node of the multiple nodes included in the hidden layer, generating a random probability for the node according to a preset rule; and
determining the node as a target node when the random probability is less than the hiding probability.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for training a convolutional neural network model, the apparatus including:
a selecting module, configured to, for each hidden layer of the multiple hidden layers included in a convolutional neural network model, select target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another; and
a training module, configured to train the convolutional neural network model based on the target nodes selected from the multiple hidden layers.
Optionally, the apparatus further includes:
a determining module, configured to determine the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers increase successively in descending order of the abstraction level of the output values of the multiple hidden layers.
Optionally, the determining module includes:
a first obtaining submodule, configured to obtain, for each hidden layer of the multiple hidden layers included in the convolutional neural network model, the output value of the hidden layer; and
a first determining submodule, configured to determine the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, the first determining submodule is configured to:
perform singular value decomposition on the output value of the hidden layer to obtain N singular values, where N is a positive integer greater than 1;
calculate the sum of squares of the N singular values, and calculate the product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares;
sort the N singular values in descending order to obtain a sorting result;
determine the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1; and
determine the ratio of M to N as the hiding probability of the hidden layer.
Optionally, the determining module includes:
a second obtaining submodule, configured to obtain the output value of a first hidden layer and the output value of a second hidden layer, where the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers;
a second determining submodule, configured to determine the hiding probability of the first hidden layer based on the output value of the first hidden layer, and to determine the hiding probability of the second hidden layer based on the output value of the second hidden layer; and
a third determining submodule, configured to determine the hiding probabilities of the other hidden layers among the multiple hidden layers, excluding the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer; and
the third determining submodule is configured to:
determine the hiding probability of each hidden layer located between the first hidden layer and the last hidden layer based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
Optionally, the selecting module includes:
a fourth determining submodule, configured to generate, for each node of the multiple nodes included in the hidden layer, a random probability for the node according to a preset rule; and
a fifth determining submodule, configured to determine the node as a target node when the random probability is less than the hiding probability.
According to a third aspect of the embodiments of the present disclosure, there is provided an apparatus for training a convolutional neural network model, the apparatus including:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of any method described in the first aspect above.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium having instructions stored thereon, where the instructions, when executed by a processor, implement the steps of any method described in the first aspect above.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects: for each hidden layer of the multiple hidden layers included in a convolutional neural network model, target nodes are selected from the multiple nodes included in the hidden layer according to the hiding probability of the hidden layer, and the convolutional neural network model is trained according to the target nodes selected from the multiple hidden layers, where the hiding probabilities of the multiple hidden layers differ from one another. Because different hidden layers correspond to different input values, selecting target nodes from different hidden layers with different hiding probabilities for training, rather than training all hidden layers with the same hiding probability as in the related art, can effectively improve the image recognition accuracy of the convolutional neural network model.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and serve, together with the specification, to explain the principles of the present invention.
Fig. 1 is an architecture diagram of a convolutional neural network model according to an exemplary embodiment.
Fig. 2 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment.
Fig. 3 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment.
Fig. 4 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment.
Fig. 5A is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Fig. 5B is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Fig. 5C is a block diagram of a determining module according to an exemplary embodiment.
Fig. 5D is a block diagram of another determining module according to an exemplary embodiment.
Fig. 6 is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Fig. 7 is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment.
Detailed description
To make the objectives, technical solutions and advantages of the present disclosure clearer, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present disclosure are explained in detail, the application scenario involved in the embodiments of the present disclosure is first introduced.
A convolutional neural network model is a neural network for image classification and recognition developed on the basis of the traditional multi-layer neural network; compared with the traditional multi-layer neural network, a convolutional neural network model introduces convolution and pooling algorithms. Before images are classified and recognized with a convolutional neural network model, the model needs to be trained. When training the convolutional neural network model, multiple samples in a training sample set can be input into the model and a forward computation performed, so as to obtain the output value of each layer of the model. Afterwards, a backward computation can be performed according to the output value of the last layer of the model and the output values of the remaining layers, so as to update the parameters of the nodes of each layer.
During training, if the amount of sample data in the training sample set is too small, the trained convolutional neural network model may fit the samples in the training set very accurately but show very poor accuracy when fitting test data; this phenomenon is over-fitting. If a convolutional neural network model over-fits, the image recognition accuracy achieved with it is also greatly reduced.
At present, in order to prevent the above over-fitting, the Dropout technique can be used to train the convolutional neural network model. The Dropout technique was first proposed by Hinton, the father of deep learning. When training a convolutional neural network model with the Dropout technique, for each hidden layer of the multiple hidden layers included in the model, multiple target nodes in the hidden layer can be selected with a predetermined probability during one training iteration, and the parameters of the selected target nodes are updated. The remaining, unselected nodes of the hidden layer are regarded as hidden during the current training iteration; that is, the parameters of the unselected nodes of the hidden layer are temporarily not updated. When another training iteration is performed with other training samples, target nodes are again selected with the predetermined probability. Since the nodes selected in each training iteration differ, the convolutional neural network model obtained from each training iteration also differs. Because the nodes included in each hidden layer are selectively kept or discarded when training with the Dropout technique, the coupling between the nodes of each hidden layer is reduced, over-fitting is alleviated, and image recognition accuracy is thereby improved. The method for training a convolutional neural network model provided by the embodiments of the present disclosure can be used in the above process of training a convolutional neural network model with the Dropout technique.
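To make the related-art baseline concrete, the following is a minimal sketch of Dropout node selection with a single predetermined probability shared by every hidden layer; the value 0.5 is the related-art setting cited later in this description, while the layer sizes and the use of NumPy are illustrative assumptions rather than details from the disclosure.

    import numpy as np

    rng = np.random.default_rng(0)
    PREDETERMINED_PROB = 0.5                      # the same probability for every hidden layer

    def related_art_selection(layer_sizes):
        # One boolean mask per hidden layer: True marks a selected target node
        # whose parameters will be updated in this training iteration.
        return [rng.random(n) < PREDETERMINED_PROB for n in layer_sizes]

    masks = related_art_selection([4, 4, 4, 4])   # e.g. four 4-node hidden layers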
Now that the application scenario of the embodiments of the present disclosure has been introduced, the basic architecture of the convolutional neural network model involved in the embodiments of the present disclosure is introduced next.
Fig. 1 shows the architecture of a convolutional neural network model provided by an embodiment of the present disclosure. As shown in Fig. 1, the convolutional neural network model includes an input layer 101, hidden layers 102-105 and an output layer 106, where the input layer 101 is connected to the hidden layer 102, the hidden layers 102, 103, 104 and 105 are connected in sequence, and the hidden layer 105 is connected to the output layer 106. The input layer may include only one node or may include multiple nodes; any of the hidden layers 102-105 may include multiple nodes; and the output layer 106 may include one node or multiple nodes. As shown in Fig. 1, in the embodiments of the present disclosure, it is assumed that the input layer includes only one node, each of the hidden layers 102-105 includes 4 nodes, and the output layer 106 includes one node.
The input layer is used to determine the pixel values of all pixels included in the input image and to transmit the pixel values of all pixels of the image to the hidden layer 102, which can be a convolutional layer. After the hidden layer 102 receives the pixel values of all pixels, it performs a first convolution operation on the received pixel values, obtains the pixels after the first convolution processing, and transmits them to the hidden layer 103; the hidden layer 103 can be a convolutional layer, a sampling layer or a pooling layer. When the hidden layer 103 is a pooling layer, it can perform a first pooling operation on the pixel values of the pixels after the first convolution processing, obtain the pixels after the first pooling processing, and transmit them to the next hidden layer 104. The hidden layer 104 can then be a convolutional layer, which performs a second convolution operation on the pixels after the first pooling processing, obtains the pixels after the second convolution processing, and transmits them to the next hidden layer 105. When the hidden layer 104 is a convolutional layer, the hidden layer 105 can be a pooling layer; it can perform a second pooling operation on the received pixels after the second convolution processing, obtain the pixels after the second pooling processing, and transmit them to the output layer 106. Generally, the output layer 106 is a fully connected layer, which can determine, according to the pixel values of the received processed pixels, the probability that the image belongs to each of multiple preset categories, thereby obtaining the category of the image.
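As a concrete reference for the data flow of Fig. 1, the sketch below assembles the convolution, pooling and fully connected stack in PyTorch; the channel counts, kernel sizes, 28x28 grayscale input and 10 preset categories are illustrative assumptions, not values taken from this disclosure.

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=3, padding=1),   # hidden layer 102: first convolution
        nn.MaxPool2d(2),                             # hidden layer 103: first pooling
        nn.Conv2d(8, 16, kernel_size=3, padding=1),  # hidden layer 104: second convolution
        nn.MaxPool2d(2),                             # hidden layer 105: second pooling
        nn.Flatten(),
        nn.Linear(16 * 7 * 7, 10),                   # output layer 106: fully connected
    )

    logits = model(torch.randn(1, 1, 28, 28))        # one 28x28 grayscale input image
    probabilities = logits.softmax(dim=1)            # probability of each preset category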
It should be noted that a hidden layer in a convolutional neural network model can be a convolutional layer, a sampling layer, a pooling layer or a fully connected layer. The architecture of the convolutional neural network model given above is only one possible architecture provided by the embodiments of the present disclosure and does not limit the embodiments of the present disclosure.
Now that the application scenario of the embodiments of the present disclosure and the architecture of the convolutional neural network model involved have been introduced, the implementation of training a convolutional neural network model provided by the embodiments of the present disclosure is explained in detail next.
Fig. 2 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment. As shown in Fig. 2, the method can be used in a terminal or in a server, and includes the following steps:
In step 201, for each hidden layer of the multiple hidden layers included in a convolutional neural network model, target nodes are selected from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another.
In step 202, the convolutional neural network model is trained based on the target nodes selected from the multiple hidden layers.
In the embodiments of the present disclosure, for each hidden layer of the multiple hidden layers included in a convolutional neural network model, target nodes are selected from the multiple nodes included in the hidden layer according to the hiding probability of the hidden layer, and the convolutional neural network model is trained according to the target nodes selected from the multiple hidden layers, where the hiding probabilities of the multiple hidden layers differ from one another. Because different hidden layers correspond to different input values, selecting target nodes from different hidden layers with different hiding probabilities for training, rather than training all hidden layers with the same hiding probability as in the related art, can effectively improve the image recognition accuracy of the convolutional neural network model.
Optionally, before the target nodes are selected from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, the method further includes:
determining the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers increase successively in descending order of the abstraction level of the output values of the multiple hidden layers.
Optionally, determining the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model includes:
for each hidden layer of the multiple hidden layers included in the convolutional neural network model, obtaining the output value of the hidden layer; and
determining the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, determining the hiding probability of the hidden layer based on the output value of the hidden layer includes:
performing singular value decomposition on the output value of the hidden layer to obtain N singular values, where N is a positive integer greater than 1;
calculating the sum of squares of the N singular values, and calculating the product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares;
sorting the N singular values in descending order to obtain a sorting result;
determining the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1; and
determining the ratio of M to N as the hiding probability of the hidden layer.
Optionally, determining the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model includes:
obtaining the output value of a first hidden layer and the output value of a second hidden layer, where the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers;
determining the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determining the hiding probability of the second hidden layer based on the output value of the second hidden layer; and
determining the hiding probabilities of the other hidden layers among the multiple hidden layers, excluding the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer; and
determining the hiding probabilities of the other hidden layers among the multiple hidden layers, excluding the first hidden layer and the second hidden layer, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer includes:
determining the hiding probability of each hidden layer located between the first hidden layer and the last hidden layer based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
Optionally, selecting target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer includes:
for each node of the multiple nodes included in the hidden layer, generating a random probability for the node according to a preset rule; and
determining the node as a target node when the random probability is less than the hiding probability.
All of the above optional technical solutions can be combined in any way to form optional embodiments of the present disclosure, which are not described one by one here.
When training a convolutional neural network model, for the multiple hidden layers included in the model, target nodes can be selected from the multiple nodes included in each hidden layer with different hiding probabilities, and the model is trained according to the selected target nodes. Before the target nodes are selected, the hiding probability of each of the multiple hidden layers can be determined.
It should be noted that in a convolutional neural network model, the abstraction level of the output value of each hidden layer differs. Generally, in order from the input layer to the output layer, the closer a hidden layer is to the output layer, the closer its output value is to category information, that is, the higher its abstraction level; and the closer a hidden layer is to the input layer, the closer its output value is to shape information and the lower its abstraction level. A hidden layer with a high abstraction level can be trained with a smaller hiding probability, and a hidden layer with a lower abstraction level can be trained with a larger hiding probability.
Based on the foregoing description, in the embodiments of the present disclosure, the hiding probability of each of the multiple hidden layers can be determined according to the abstraction level of the output values of the multiple hidden layers and the principle that the hiding probability of a hidden layer decreases as the abstraction level rises. Specifically, the hiding probability of each of the multiple hidden layers can be determined by two different methods, and the convolutional neural network model is then trained according to these hiding probabilities. Next, the first method for training a convolutional neural network model provided by the embodiments of the present disclosure is explained in detail with reference to Fig. 3.
Fig. 3 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment. The method can be used in a terminal or a server; in the embodiments of the present disclosure, a terminal is taken as the execution subject for the explanation. When the execution subject is a server, the convolutional neural network model can still be trained by the implementation process of the following embodiments. As shown in Fig. 3, the method includes the following steps:
In step 301, for each hidden layer of the multiple hidden layers included in the convolutional neural network model, the output value of the hidden layer is obtained.
In the embodiments of the present disclosure, when training the convolutional neural network model, a forward computation can be performed on a training image in the training sample set; that is, the training image is input at the input layer of the convolutional neural network model, passes through the computations of the multiple intermediate hidden layers, and the recognition result for the training image is finally output by the output layer of the model.
In the process of performing the forward computation on the training image, in order from the input layer to the output layer, the output value of the input layer serves as the input value of the first hidden layer, and the output value of the last hidden layer serves as the input value of the output layer. For two adjacent hidden layers, the output value of the previous hidden layer serves as the input value of the next hidden layer. As the training image is transferred from the input layer to the output layer, the terminal can obtain the output value of each of the multiple hidden layers of the convolutional neural network model.
In step 302, the hiding probability of each hidden layer is determined based on the output value of each hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another and increase successively in descending order of the abstraction level of the output values of the multiple hidden layers.
After the output value of each hidden layer has been obtained, the terminal can determine the hiding probability of each hidden layer from its output value by the following method. The hiding probability subsequently serves as the basis for selecting among the multiple nodes included in the hidden layer.
For each hidden layer of the multiple hidden layers, the operation of determining the hiding probability of the hidden layer based on its output value can be: performing singular value decomposition on the output value of the hidden layer to obtain N singular values, where N is a positive integer greater than 1; calculating the sum of squares of the N singular values, and calculating the product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares; sorting the N singular values in descending order to obtain a sorting result; determining the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1; and determining the ratio of M to N as the hiding probability of the hidden layer.
It should be noted that after the output value of the hidden layer has been obtained, singular value decomposition can be performed on the output value to obtain N singular values. Afterwards, the sum of squares of the N singular values can be calculated, and the target sum of squares calculated according to the preset ratio. For example, assuming the sum of squares of the N singular values is R and the preset ratio is k, the target sum of squares is W = k*R. The preset ratio can be 80%, 70%, or another value.
After the target sum of squares has been determined, the terminal can sort the obtained N singular values in descending order to obtain a sorting result. Afterwards, starting from the first singular value in the sorting result, the terminal can calculate the square of the first singular value and judge whether it is greater than or equal to the target sum of squares. If the square of this singular value is less than the target sum of squares, the terminal can add the square of the second singular value to the square of the first singular value to obtain the sum of squares of the first two singular values in the sorting result, and judge whether this sum of squares is greater than or equal to the target sum of squares. Following this method, the terminal keeps calculating and judging until the sum of squares of the first M singular values in the sorting result is greater than or equal to the target sum of squares, at which point the calculation stops and the ratio of M to N is taken as the hiding probability of the hidden layer.
For example, assume that singular value decomposition of the output value of a hidden layer yields 10 singular values, that is, N = 10. The sum of squares of the 10 singular values is R, and the preset ratio is 80%, so the target sum of squares is W = 80%*R. Sorting the 10 singular values gives the sorting result (n1, n2, n3, n4, n5, n6, n7, n8, n9, n10). The terminal first calculates n1² and judges whether n1² is greater than or equal to W. If n1² is greater than or equal to W, then M is 1, and the hiding probability of the hidden layer is M/N = 0.1. If n1² is less than W, the terminal calculates n1² + n2² and continues to judge whether n1² + n2² is greater than or equal to W. If n1² + n2² is less than W, the terminal continues to calculate n1² + n2² + n3², and so on, until the calculated sum of squares is greater than or equal to W, at which point the calculation stops. Assume the terminal determines, upon calculating n1² + n2² + n3² + n4² + n5² + n6², that this sum is greater than or equal to the target sum of squares; then M = 6, and the hiding probability of the hidden layer is calculated as M/N = 0.6.
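The following is a minimal sketch of this singular-value procedure, assuming the output value of the hidden layer is arranged as a 2-D matrix and using NumPy, whose svd routine already returns the singular values in descending order; the function name is illustrative.

    import numpy as np

    def hiding_probability(layer_output, preset_ratio=0.8):
        # The N singular values of the layer's output value, sorted in descending order.
        s = np.linalg.svd(layer_output, compute_uv=False)
        n = len(s)
        target = preset_ratio * np.sum(s ** 2)            # target sum of squares W = k*R
        cumulative = np.cumsum(s ** 2)                    # sums of squares of the first 1..N values
        m = int(np.searchsorted(cumulative, target)) + 1  # smallest M whose sum reaches W
        return m / n

    p = hiding_probability(np.random.default_rng(0).random((4, 16)))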
In step 303, target nodes are selected from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer.
After the hiding probability of a hidden layer has been determined, for each node of the multiple nodes included in the hidden layer, the terminal can generate a random probability for the node according to a preset rule; when the random probability is less than the hiding probability, the node is determined as a target node.
The terminal can generate a random probability for a node through a Bernoulli function or a binomial distribution function, and then compare the random probability with the hiding probability of the hidden layer, so as to judge whether the parameters of the node are to be updated in the subsequent training process.
For example, assume the hidden layer includes 4 nodes and the determined hiding probability of the hidden layer is 0.6. For node 1, assume the random probability determined for it through the Bernoulli function or the binomial distribution function is 0.4; since 0.4 is less than 0.6, node 1 can be determined as a target node, that is, the parameters of node 1 need to be updated in the subsequent training process. For node 2, assume its random probability is 0.7; since 0.7 is greater than 0.6, node 2 is not taken as a target node, that is, node 2 is temporarily hidden in the subsequent training process and its parameters are not updated. Whether each of the remaining two nodes is a target node can be determined by the same method.
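A sketch of this per-node selection is given below, under the assumption that a uniform draw stands in for the Bernoulli function or binomial distribution function; generating a uniform random probability and comparing it with the hiding probability amounts to one Bernoulli trial per node.

    import numpy as np

    def select_target_nodes(num_nodes, hiding_prob, rng):
        random_probs = rng.random(num_nodes)       # one random probability per node
        return np.flatnonzero(random_probs < hiding_prob)

    rng = np.random.default_rng(1)
    targets = select_target_nodes(4, 0.6, rng)     # indices of the nodes to be updated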
For each hidden layer, the terminal can select at least one target node by the above method. After the target nodes corresponding to each hidden layer have been determined, the terminal can train the convolutional neural network model by the method in step 304.
In step 304, the convolutional neural network model is trained based on the target nodes selected from the multiple hidden layers.
When the terminal has determined the target nodes of the multiple hidden layers and the forward computation of the training process has been completed, the terminal can start the backward computation. During the backward computation, the parameters of the selected target nodes can be updated according to the output value of the output layer obtained during the forward computation, thereby completing the training of the convolutional neural network model.
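A sketch of this update at the end of one iteration is given below, under the assumption that the per-layer parameters and the gradients produced by the backward computation are held as NumPy arrays indexed by node; only the selected target nodes are updated, while the hidden nodes keep their old parameters.

    import numpy as np

    def update_target_nodes(params_per_layer, grads_per_layer, targets_per_layer, lr=0.01):
        for params, grads, targets in zip(params_per_layer, grads_per_layer, targets_per_layer):
            # Unselected (hidden) nodes are skipped, so their parameters stay unchanged.
            params[targets] -= lr * grads[targets]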
In the embodiments of the present disclosure, the terminal can determine the hiding probability of each hidden layer of the multiple hidden layers included in a convolutional neural network model, where the hiding probabilities of the multiple hidden layers differ from one another and decrease successively in ascending order of the abstraction level of the output values of the multiple hidden layers. Since, in order from the input layer to the output layer, the abstraction level of the output value of each hidden layer differs, the terminal can determine a different hiding probability for each hidden layer according to its abstraction level. And because different hidden layers correspond to different input values, training the convolutional neural network model with different hiding probabilities, rather than training all hidden layers with the same hiding probability as in the related art, can effectively improve the image recognition accuracy of the convolutional neural network model.
It should be noted that experiments have confirmed that when hiding probabilities are determined according to the method provided above in the embodiments of the present disclosure and the convolutional neural network model is trained according to these hiding probabilities, image recognition with the trained model raises the recognition accuracy from 0.86 to 0.92 compared with a model trained in the related art with a predetermined probability of 0.5, which is a clear improvement.
The first method for training a convolutional neural network model has been described through the above embodiment. Next, the second method for training a convolutional neural network model provided by the embodiments of the present disclosure is introduced with reference to Fig. 4.
Fig. 4 is a flowchart of a method for training a convolutional neural network model according to an exemplary embodiment. The method can be used in a terminal or a server; in the embodiments of the present disclosure, a terminal is taken as the execution subject for the explanation. In practical applications, when the execution subject is a server, the convolutional neural network model can still be trained by the implementation process of the following embodiments. As shown in Fig. 4, the method includes the following steps:
In step 401, the output value of a first hidden layer and the output value of a second hidden layer are obtained, where the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers.
In the embodiments of the present disclosure, in order to reduce the amount of computation, the terminal can analyze only the output value of the first hidden layer, whose output value has the lowest abstraction level, and the output value of the second hidden layer, whose output value has the highest abstraction level, to obtain their respective hiding probabilities; for the other hidden layers, the output values need not be analyzed, and their hiding probabilities are determined with a simpler algorithm. Therefore, in this step, the terminal can obtain only the output value of the first hidden layer and the output value of the second hidden layer.
It should be noted that in current convolutional neural network models, generally, the abstraction level of the output value of the first hidden layer connected to the input layer is the lowest, and the abstraction level of the output value of the last hidden layer connected to the output layer is usually the highest. Therefore, the terminal can directly take the output value of the first hidden layer connected to the input layer as the output value of the first hidden layer, and the output value of the last hidden layer connected to the output layer as the output value of the second hidden layer.
In step 402, the hiding probability of the first hidden layer is determined based on the output value of the first hidden layer, and the hiding probability of the second hidden layer is determined based on the output value of the second hidden layer.
After obtaining the output value of the first hidden layer, the terminal can determine the hiding probability of the first hidden layer in the manner of step 302 in the preceding embodiment. Likewise, the hiding probability of the second hidden layer can be determined in the manner of determining a hiding probability in step 302. The embodiments of the present disclosure do not repeat this here.
In step 403, the hiding probabilities of the other hidden layers among the multiple hidden layers, excluding the first hidden layer and the second hidden layer, are determined based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer.
After the hiding probabilities of the first hidden layer and the second hidden layer have been determined, the terminal can calculate the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer; afterwards, the terminal can determine the hiding probabilities of the remaining hidden layers of the multiple hidden layers based on this probability difference.
The terminal can sort the multiple hidden layers in ascending order of the abstraction level of their output values to obtain a sorting result, in which the first hidden layer of the sorting result is the first hidden layer and the last hidden layer of the sorting result is the second hidden layer. Afterwards, the terminal can determine the hiding probability of each hidden layer located between the first and last hidden layers of the sorting result based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the second hidden layer.
Specifically, after the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer has been calculated, the terminal can determine the number K of hidden layers included in the convolutional neural network model and calculate the ratio between the probability difference and K-1. Afterwards, the terminal can subtract this ratio from the hiding probability of the first hidden layer to obtain the hiding probability of the second hidden layer in the sorting result, and so on. That is, for each hidden layer located between the first hidden layer and the second hidden layer of the sorting result, the ratio can be subtracted from the hiding probability of that hidden layer to obtain the hiding probability of the next hidden layer after it in the sorting result.
It should be noted that in current convolutional neural network models, the abstraction level of the output values of the multiple hidden layers usually rises successively in order from the input layer to the output layer; that is, the abstraction level of the output value of the first hidden layer connected to the input layer is the lowest, the abstraction level of the output value of the next hidden layer connected to the first hidden layer is higher than that of the first hidden layer, and so on, with the abstraction level of the output value of the last hidden layer connected to the output layer being the highest. Therefore, for such a convolutional neural network model, the terminal need not sort the multiple hidden layers by the abstraction level of their output values. That is, the terminal can directly determine the hiding probability of the first hidden layer connected to the input layer and the hiding probability of the last hidden layer connected to the output layer, calculate the probability difference between the first hidden layer and the last hidden layer, and then, based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer and the hiding probability of the last hidden layer, calculate the hiding probability of each hidden layer located between the first hidden layer and the last hidden layer of the convolutional neural network model.
For example, assume that the hiding probability of the first hidden layer connected to the input layer of the convolutional neural network model is a, the hiding probability of the last hidden layer connected to the output layer is b, and the number of hidden layers included in the model is K = 10. The terminal can then calculate the probability difference a-b between the hiding probability a of the first hidden layer and the hiding probability b of the last hidden layer, and calculate the ratio (a-b)/(K-1). Afterwards, for the next hidden layer connected to the first hidden layer, the hiding probability is a-(a-b)/(K-1); the hiding probability of the next hidden layer connected to that one is a-2*(a-b)/(K-1); and so on, so that the hiding probability of each hidden layer located between the first hidden layer and the last hidden layer can be calculated.
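A sketch of this calculation is given below: with K hidden layers ordered from the input layer to the output layer, the hiding probability falls by (a-b)/(K-1) per layer, from a at the first hidden layer down to b at the last; the values a = 0.9 and b = 0.4 are illustrative.

    def interpolated_hiding_probabilities(a, b, k):
        step = (a - b) / (k - 1)                  # the ratio subtracted layer by layer
        return [a - i * step for i in range(k)]   # layer 0 gets a, layer k-1 gets b

    probs = interpolated_hiding_probabilities(0.9, 0.4, 10)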
In step 404, for each hidden layer, target nodes are selected from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer.
After the hiding probability of each hidden layer has been determined, for each hidden layer, the terminal can select target nodes from the multiple nodes included in the hidden layer by the hiding probability of the hidden layer with reference to step 303 in the preceding embodiment. The embodiments of the present disclosure do not repeat this here.
In step 405, the convolutional neural network model is trained based on the target nodes selected from the multiple hidden layers.
For this step, reference can be made to the implementation of step 304 in the preceding embodiment, which the embodiments of the present disclosure do not repeat here.
In the embodiments of the present disclosure, the terminal can determine the hiding probability of each hidden layer of the multiple hidden layers included in a convolutional neural network model, where the hiding probabilities of the multiple hidden layers differ from one another and decrease successively in ascending order of the abstraction level of the output values of the multiple hidden layers. Since, in order from the input layer to the output layer, the abstraction level of the output value of each hidden layer differs, the terminal can determine a different hiding probability for each hidden layer according to its abstraction level. And because different hidden layers correspond to different input values, training the convolutional neural network model with different hiding probabilities, rather than training all hidden layers with the same hiding probability as in the related art, can effectively improve the image recognition accuracy of the convolutional neural network model.
In addition, in this embodiment, the terminal can analyze only the output value of the hidden layer whose output value has the highest abstraction level and the output value of the hidden layer whose output value has the lowest abstraction level; for the remaining hidden layers, no further output value analysis is needed, and their hiding probabilities are determined according to the principle that the hiding probability decreases as the abstraction level rises, which reduces the amount of computation of the terminal.
Now that the method for training a convolutional neural network model provided by the embodiments of the present disclosure has been introduced, the apparatus for training a convolutional neural network model provided by the embodiments of the present disclosure is introduced next.
Fig. 5A is a block diagram of an apparatus for training a convolutional neural network model according to an exemplary embodiment. Referring to Fig. 5A, the apparatus includes a selecting module 501 and a training module 502.
The selecting module 501 is configured to, for each hidden layer of the multiple hidden layers included in a convolutional neural network model, select target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, where the hiding probabilities of the multiple hidden layers differ from one another.
The training module 502 is configured to train the convolutional neural network model based on the target nodes selected from the multiple hidden layers.
Optionally, referring to Fig. 5B, the apparatus further includes:
a determining module 503, configured to determine the hiding probability of each hidden layer of the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers increase successively in descending order of the abstraction level of the output values of the multiple hidden layers.
Optionally, referring to Fig. 5C, the determining module 503 includes:
a first acquisition submodule 5031, configured to obtain, for each hidden layer among the multiple hidden layers included in the convolutional neural network model, the output value of the hidden layer;
a first determination submodule 5032, configured to determine the hiding probability of the hidden layer based on the output value of the hidden layer.
Optionally, the first determination submodule 5032 is configured to:
perform singular value decomposition on the output value of the hidden layer to obtain N singular values, where N is a positive integer greater than 1;
calculate the sum of squares of the N singular values, and multiply this sum of squares by a preset ratio to obtain a target sum of squares;
sort the N singular values in descending order to obtain a sorting result;
determine the M-th singular value in the sorting result, where the sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, the sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1;
determine the ratio of M to N as the hiding probability of the hidden layer.
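By way of illustration, a minimal NumPy sketch of this singular-value procedure follows; treating the layer's output value as a 2-D matrix and taking 0.9 as the preset ratio are assumptions of the sketch:

    import numpy as np

    def hiding_probability(output_value, preset_ratio=0.9):
        # N singular values; NumPy returns them already sorted in
        # descending order, which matches the sorting step above.
        s = np.linalg.svd(output_value, compute_uv=False)
        n = s.size
        target = preset_ratio * np.sum(s ** 2)   # target sum of squares
        cumulative = np.cumsum(s ** 2)           # sums of squares of the first 1..N values
        # Smallest M whose leading sum of squares exceeds the target.
        m = int(np.argmax(cumulative > target)) + 1
        return m / n                             # the ratio M/N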
Optionally, referring to Fig. 5D, the determining module 503 includes:
a second acquisition submodule 5033, configured to obtain the output value of a first hidden layer and the output value of a second hidden layer, where the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers;
a second determination submodule 5034, configured to determine the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determine the hiding probability of the second hidden layer based on the output value of the second hidden layer;
a third determination submodule 5035, configured to determine, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer, the hiding probabilities of the other hidden layers among the multiple hidden layers except the first hidden layer and the second hidden layer.
Optionally, the first hidden layer is the first hidden layer connected to the input layer, and the second hidden layer is the last hidden layer connected to the output layer;
the third determination submodule 5035 is configured to:
determine the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
Optionally, the selecting module 501 includes:
a fourth determination submodule, configured to generate, for each node among the multiple nodes included in the hidden layer, a random probability for the node according to a preset rule;
a fifth determination submodule, configured to determine the node as a target node when the random probability is less than the hiding probability.
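A minimal sketch of this selection rule follows; drawing the random probability uniformly from [0, 1) is one possible "preset rule", and zeroing out non-target nodes during the forward pass is likewise an assumption of the sketch rather than something prescribed here:

    import numpy as np

    rng = np.random.default_rng(0)

    def select_target_nodes(num_nodes, hiding_prob):
        # One random probability per node; a node becomes a target node
        # when its random probability is less than the hiding probability.
        random_probs = rng.random(num_nodes)
        return random_probs < hiding_prob        # boolean target mask

    # Illustrative use: only target nodes contribute during training.
    layer_output = rng.standard_normal(8)
    mask = select_target_nodes(8, hiding_prob=0.7)
    masked_output = np.where(mask, layer_output, 0.0)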
In the embodiments of the present disclosure, as with the method described above, the terminal can determine the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, where the hiding probabilities of the multiple hidden layers differ from one another and decrease successively in the order of increasing abstraction level of the hidden layers' output values. Because different hidden layers correspond to different input values, training the convolutional neural network model with different hiding probabilities can effectively improve the image recognition accuracy of the model compared with the related art, in which the same hiding probability is applied to all hidden layers.
With regard to the apparatus in the foregoing embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method, and is not elaborated here.
Fig. 6 is a block diagram of an apparatus 600 for training a convolutional neural network model according to an exemplary embodiment. For example, the apparatus 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 6, the apparatus 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 typically controls the overall operations of the apparatus 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions so as to complete all or part of the steps of the above methods. In addition, the processing component 602 may include one or more modules to facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation of the apparatus 600. Examples of such data include instructions for any application or method operated on the apparatus 600, contact data, phonebook data, messages, pictures, videos, and so on. The memory 604 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 606 supplies power to the various components of the apparatus 600. The power component 606 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the apparatus 600.
The multimedia component 608 includes a screen providing an output interface between the apparatus 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the apparatus 600 is in an operating mode, such as a photographing mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each of the front camera and the rear camera may be a fixed optical lens system or have focus and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC) configured to receive external audio signals when the apparatus 600 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 604 or transmitted via the communication component 616. In some embodiments, the audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, such as a keyboard, a click wheel, or buttons. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors to provide status assessments of various aspects of the apparatus 600. For example, the sensor component 614 may detect the open/closed status of the apparatus 600 and the relative positioning of components, for example the display and keypad of the apparatus 600; the sensor component 614 may also detect a change in position of the apparatus 600 or of a component of the apparatus 600, the presence or absence of user contact with the apparatus 600, the orientation or acceleration/deceleration of the apparatus 600, and a change in temperature of the apparatus 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an accelerometer, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the apparatus 600 and other devices. The apparatus 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, to perform the methods provided by the embodiments shown in Figs. 2-4 above.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 604 including instructions, executable by the processor 620 of the apparatus 600 to complete the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Also provided is a non-transitory computer-readable storage medium: when the instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform the method for training a convolutional neural network model provided by the embodiments shown in Fig. 2, Fig. 3, and Fig. 4 above.
Fig. 7 is a block diagram of an apparatus 700 for training a convolutional neural network model according to an exemplary embodiment. For example, the apparatus 700 may be provided as a server. Referring to Fig. 7, the apparatus 700 includes a processing component 722, which further includes one or more processors, and memory resources represented by a memory 732 for storing instructions executable by the processing component 722, such as application programs. The application programs stored in the memory 732 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 722 is configured to execute the instructions to perform the methods provided by the embodiments shown in Figs. 2-4 above.
The apparatus 700 may also include a power component 726 configured to perform power management of the apparatus 700, a wired or wireless network interface 750 configured to connect the apparatus 700 to a network, and an input/output (I/O) interface 758. The apparatus 700 may operate based on an operating system stored in the memory 732, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 732 including instructions, executable by the processing component 722 of the apparatus 700 to complete the above methods. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Also provided is a non-transitory computer-readable storage medium: when the instructions in the storage medium are executed by a processor of a server, the server is enabled to perform the method for training a convolutional neural network model provided by the embodiments shown in Fig. 2, Fig. 3, and Fig. 4 above.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or conventional techniques in the art not disclosed in this disclosure. The specification and embodiments are to be considered exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (16)

1. A method for training a convolutional neural network model, characterized in that the method comprises:
    for each hidden layer among multiple hidden layers included in the convolutional neural network model, selecting target nodes from multiple nodes included in the hidden layer based on a hiding probability of the hidden layer, wherein the hiding probabilities of the multiple hidden layers differ from one another;
    training the convolutional neural network model based on the target nodes selected from the multiple hidden layers.
2. The method according to claim 1, characterized in that before the selecting target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer, the method further comprises:
    determining the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, wherein the hiding probabilities of the multiple hidden layers rise successively in the order of decreasing abstraction level of the output values of the multiple hidden layers.
3. The method according to claim 2, characterized in that the determining the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model comprises:
    for each hidden layer among the multiple hidden layers included in the convolutional neural network model, obtaining an output value of the hidden layer;
    determining the hiding probability of the hidden layer based on the output value of the hidden layer.
4. The method according to claim 3, characterized in that the determining the hiding probability of the hidden layer based on the output value of the hidden layer comprises:
    performing singular value decomposition on the output value of the hidden layer to obtain N singular values, wherein N is a positive integer greater than 1;
    calculating a sum of squares of the N singular values, and calculating a product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares;
    sorting the N singular values in descending order to obtain a sorting result;
    determining an M-th singular value in the sorting result, wherein a sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, a sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1;
    determining a ratio of M to N as the hiding probability of the hidden layer.
5. The method according to claim 2, characterized in that the determining the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model comprises:
    obtaining an output value of a first hidden layer and an output value of a second hidden layer, wherein the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers;
    determining the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determining the hiding probability of the second hidden layer based on the output value of the second hidden layer;
    determining, based on a probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer, the hiding probabilities of the other hidden layers among the multiple hidden layers except the first hidden layer and the second hidden layer.
6. The method according to claim 5, characterized in that the first hidden layer is the first hidden layer connected to an input layer, and the second hidden layer is the last hidden layer connected to an output layer;
    the determining, based on the probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer, the hiding probabilities of the other hidden layers among the multiple hidden layers except the first hidden layer and the second hidden layer comprises:
    determining the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
7. The method according to any one of claims 1-6, characterized in that the selecting target nodes from the multiple nodes included in the hidden layer based on the hiding probability of the hidden layer comprises:
    for each node among the multiple nodes included in the hidden layer, generating a random probability for the node according to a preset rule;
    when the random probability is less than the hiding probability, determining the node as a target node.
8. An apparatus for training a convolutional neural network model, characterized in that the apparatus comprises:
    a selecting module, configured to, for each hidden layer among multiple hidden layers included in the convolutional neural network model, select target nodes from multiple nodes included in the hidden layer based on a hiding probability of the hidden layer, wherein the hiding probabilities of the multiple hidden layers differ from one another;
    a training module, configured to train the convolutional neural network model based on the target nodes selected from the multiple hidden layers.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
    a determining module, configured to determine the hiding probability of each hidden layer among the multiple hidden layers included in the convolutional neural network model, wherein the hiding probabilities of the multiple hidden layers rise successively in the order of decreasing abstraction level of the output values of the multiple hidden layers.
10. The apparatus according to claim 9, characterized in that the determining module comprises:
    a first acquisition submodule, configured to obtain, for each hidden layer among the multiple hidden layers included in the convolutional neural network model, an output value of the hidden layer;
    a first determination submodule, configured to determine the hiding probability of the hidden layer based on the output value of the hidden layer.
11. The apparatus according to claim 10, characterized in that the first determination submodule is configured to:
    perform singular value decomposition on the output value of the hidden layer to obtain N singular values, wherein N is a positive integer greater than 1;
    calculate a sum of squares of the N singular values, and calculate a product of the sum of squares of the N singular values and a preset ratio to obtain a target sum of squares;
    sort the N singular values in descending order to obtain a sorting result;
    determine an M-th singular value in the sorting result, wherein a sum of squares of the first M singular values in the sorting result is greater than the target sum of squares, a sum of squares of the first M-1 singular values in the sorting result is less than the target sum of squares, and M is a positive integer greater than or equal to 1;
    determine a ratio of M to N as the hiding probability of the hidden layer.
12. The apparatus according to claim 9, characterized in that the determining module comprises:
    a second acquisition submodule, configured to obtain an output value of a first hidden layer and an output value of a second hidden layer, wherein the first hidden layer is the hidden layer whose output value has the lowest abstraction level among the multiple hidden layers, and the second hidden layer is the hidden layer whose output value has the highest abstraction level among the multiple hidden layers;
    a second determination submodule, configured to determine the hiding probability of the first hidden layer based on the output value of the first hidden layer, and determine the hiding probability of the second hidden layer based on the output value of the second hidden layer;
    a third determination submodule, configured to determine, based on a probability difference between the hiding probability of the first hidden layer and the hiding probability of the second hidden layer, the hiding probabilities of the other hidden layers among the multiple hidden layers except the first hidden layer and the second hidden layer.
13. The apparatus according to claim 12, characterized in that the first hidden layer is the first hidden layer connected to an input layer, and the second hidden layer is the last hidden layer connected to an output layer;
    the third determination submodule is configured to:
    determine the hiding probability of each hidden layer between the first hidden layer and the last hidden layer based on the number of the multiple hidden layers, the probability difference, the hiding probability of the first hidden layer, and the hiding probability of the last hidden layer.
14. The apparatus according to any one of claims 8-13, characterized in that the selecting module comprises:
    a fourth determination submodule, configured to, for each node among the multiple nodes included in the hidden layer, generate a random probability for the node according to a preset rule;
    a fifth determination submodule, configured to determine the node as a target node when the random probability is less than the hiding probability.
15. An apparatus for training a convolutional neural network model, characterized in that the apparatus comprises:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform the steps of the method according to any one of claims 1-7.
16. A computer-readable storage medium having instructions stored thereon, characterized in that the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-7.