CN109583485A - Supervised deep learning method based on feedback training - Google Patents

Supervised deep learning method based on feedback training

Info

Publication number
CN109583485A
CN109583485A, CN201811367393.4A
Authority
CN
China
Prior art keywords
sample
deep learning
weight parameter
training
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811367393.4A
Other languages
Chinese (zh)
Other versions
CN109583485B (en)
Inventor
杨俊杰
郑军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jushi Technology (shanghai) Co Ltd
Original Assignee
Jushi Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jushi Technology (shanghai) Co Ltd filed Critical Jushi Technology (shanghai) Co Ltd
Priority to CN201811367393.4A priority Critical patent/CN109583485B/en
Publication of CN109583485A publication Critical patent/CN109583485A/en
Application granted granted Critical
Publication of CN109583485B publication Critical patent/CN109583485B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a supervised deep learning method based on feedback training. During training of a supervised deep learning model, at the start of each iteration, each sample in the training set is sampled with a sampling probability, and the sampling probability is dynamically adjusted according to the prediction loss value of each sample. Compared with the prior art, the present invention associates the prediction loss value of each sample in supervised training with the sample's own sampling frequency and adjusts the probability that outlier samples are sampled by way of feedback training, and thus has advantages such as an improved training effect.

Description

Supervised deep learning method based on feedback training
Technical field
The present invention relates to the field of deep learning, and more particularly to a supervised deep learning method based on feedback training.
Background art
Existing supervised deep learning methods require learning from a large amount of sample data. To reduce the hardware demand during training of a supervised deep learning model, the model is usually trained with mini-batch sampling or single-sample input, and the common sampling modes are uniform sampling or sequential input.
In this case, the large number of ordinary samples and the small number of outlier samples are fed into model training with equal probability, which makes it difficult for the model to learn the spatial distribution of the few outlier samples. When the training objective of the model is to detect or identify the few outlier samples, training with a conventional sampling mode not only reduces the accuracy of the model but also slows down its training.
To solve the above problems, existing solutions usually train with data resampling, class-balanced sampling, cost-sensitive matrices, or cost-sensitive vectors. Resampling and class-balanced sampling draw the same number of samples from each class for training. These methods work well on sample-imbalance problems where the inter-class difference is large and the intra-class difference is small. However, when the intra-class difference is large, that is, when a class contains a few outlier samples, it remains very difficult for the model to learn their distribution. Cost-sensitive matrix or cost-sensitive vector methods build a confusion matrix or cost-sensitive matrix to increase the learning rate for misclassified categories, thereby accelerating the model's learning of outlier samples. However, when the outlier samples lie inside a class with a large sample count, the probability that they are drawn is very small, so the effect of these methods is almost negligible.
Therefore, in order to improve the learning efficiency on outlier samples, not only the inter-class sample imbalance problem but also the intra-class sample imbalance problem must be solved, and the prior art has difficulty solving both.
Summary of the invention
It is an object of the present invention to overcome the above-mentioned drawbacks of the prior art and to provide a supervised deep learning method based on feedback training.
The purpose of the present invention can be achieved through the following technical solutions:
A supervised deep learning method based on feedback training: during training of a supervised deep learning model, at the start of each iteration, each sample in the training set is sampled with a sampling probability, and the sampling probability is dynamically adjusted according to the prediction loss value of each sample.
Further, the process of dynamically adjusting the sampling probability specifically comprises:
1) initialize the weight parameter of each sample;
2) calculate the corresponding sampling probability of each sample from the weight parameter it currently has:
P(i) = p_i^α / Σ_k p_k^α
where P(i) is the sampling probability of sample i, α is the priority factor, and p_i is the weight parameter of sample i;
3) after one iteration, obtain the prediction loss value of each sample and update the weight parameter based on the prediction loss value;
4) at the start of the next iteration, let p_i = p(i) and return to step 2).
Further, when the weight parameters are initialized, the weight parameter of each sample is set to 1.
Further, updating the weight parameter based on the prediction loss value is specifically:
p(i) = |δ(i)| + ε
where p(i) is the updated weight parameter of sample i, δ(i) is the prediction loss value of sample i, and ε is a correction factor.
Further, the correction factor ε is a positive number greater than 0.
Further, the expression of the prediction loss value δ(i) is:
δ(i) = L(y_i, f(x_i))
where x_i is the input, y_i is the ground-truth label corresponding to x_i, the function f predicts a label from the input x_i, and the function L is the loss function that measures the difference between the ground-truth label y_i and the predicted label f(x_i).
Further, when the weight parameter is updated based on the prediction loss value, the weight parameter is proportional to the reciprocal of the prediction loss value.
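For concreteness, the sampling-probability and weight-update rules given above can be written as a short script. This is a minimal illustrative sketch only: the function names, the example value α = 0.6, and the toy loss numbers are assumptions and not part of the patent; only the relations P(i) = p_i^α / Σ_k p_k^α and p(i) = |δ(i)| + ε are taken from the text.

```python
import numpy as np

def sampling_probabilities(weights, alpha):
    """P(i) = p_i**alpha / sum_k p_k**alpha; alpha = 0 gives uniform sampling."""
    scaled = np.power(weights, alpha)
    return scaled / scaled.sum()

def updated_weights(losses, eps):
    """p(i) = |delta(i)| + eps; eps keeps zero-loss samples from never being drawn again."""
    return np.abs(losses) + eps

# One feedback cycle on five toy samples (all values are illustrative).
weights = np.ones(5)                                # step 1): every weight starts at 1
probs = sampling_probabilities(weights, alpha=0.6)  # step 2): sampling probability per sample
losses = np.array([0.10, 0.05, 2.30, 0.02, 0.08])   # step 3): per-sample prediction loss values
weights = updated_weights(losses, eps=1e-5)         # steps 3)-4): feed the losses back as weights
```

With these updated weights, the low-loss samples receive small sampling probabilities at the next iteration, while the sample with loss 2.30 dominates the draw.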
Compared with the prior art, the present invention has the following advantages:
First, the present invention is the first to propose using dynamic sampling-rate adjustment in supervised deep learning. By increasing the probability that outlier samples are learned, the model quickly learns the distribution of the whole sample space, thereby reducing model training time while improving the training effect.
Second, the present invention can be combined with other sampling modes (such as resampling, class-balanced sampling, and cost-sensitive matrices) to achieve an even better training effect.
Third, the present invention can be used inversely: by reducing the probability that outlier samples are sampled, it increases the model's ability to learn the features of ordinary samples.
Brief description of the drawings
Fig. 1 is a flow diagram of training a supervised deep learning model according to the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments. The embodiments are implemented on the premise of the technical solution of the present invention, and detailed implementations and specific operation processes are given, but the protection scope of the present invention is not limited to the following embodiments.
The present invention provides a supervised deep learning method based on feedback training, which runs on a GPU and is applied to image processing. During training of a supervised deep learning model, at the start of each iteration, each sample in the training set is sampled with a sampling probability, and the sampling probability is dynamically adjusted according to the prediction loss value of each sample.
The process of dynamically adjusting the sampling probability specifically comprises:
1) initialize the weight parameter of each sample, p_i = 1;
2) calculate the corresponding sampling probability of each sample from the weight parameter it currently has:
P(i) = p_i^α / Σ_k p_k^α
where P(i) is the sampling probability of sample i, p_i is the weight parameter of sample i, and α is the priority factor: the larger its value, the higher the priority it represents; when α is 0 the sampling is uniform;
3) after one iteration, obtain the prediction loss value of each sample and update the weight parameter:
p(i) = |δ(i)| + ε
where p(i) is the updated weight parameter of sample i, δ(i) is the prediction loss value of sample i, and ε is a correction factor, which can take a very small positive constant such as 10^-5, so that a sample with δ(i) = 0 will not be excluded from sampling forever;
4) at the start of the next iteration, let p_i = p(i) and return to step 2).
The expression of the prediction loss value δ(i) is:
δ(i) = L(y_i, f(x_i))
where x_i is the input, y_i is the ground-truth label corresponding to x_i, the function f predicts a label from the input x_i, and the function L is the loss function that measures the difference between the ground-truth label y_i and the predicted label f(x_i).
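As an illustration of δ(i) = L(y_i, f(x_i)) for the image-classification embodiment, the sketch below uses cross-entropy as the loss function L. The patent does not fix a particular L, so the choice of cross-entropy and all names here are assumptions.

```python
import numpy as np

def per_sample_loss(pred_probs, y_true):
    """delta(i) = L(y_i, f(x_i)), with L chosen here as cross-entropy.

    pred_probs: array of shape (N, C) holding f(x_i) as predicted class probabilities.
    y_true:     array of shape (N,) holding the integer ground-truth labels y_i.
    Returns one loss value per sample, ready for the update p(i) = |delta(i)| + eps.
    """
    guard = 1e-12  # numerical guard against log(0); unrelated to the patent's epsilon
    picked = pred_probs[np.arange(len(y_true)), y_true]
    return -np.log(picked + guard)
```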
The above method can be combined with other sampling modes (such as resampling, class-balanced sampling, and cost-sensitive matrices) to achieve an even better training effect. For example, when combined with class-balanced sampling, the same number of samples is drawn from the large-sample-count class and the small-sample-count class respectively, and the sampling probability within each class is calculated from the weight values.
The above method can also be used inversely: reducing the probability that outlier samples are sampled increases the model's ability to learn ordinary-sample features. For example, when an autoencoder (Auto-encoder) is used, more features of normal, standard samples need to be learned; in this case more normal samples should be sampled, and by using the reciprocal of the loss value as the sample's weight when computing the sampling probability, outlier samples are sampled less often. A sketch of this variant follows.
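This is a minimal sketch of the inverse use, assuming the reciprocal form 1/(|δ(i)| + ε) for the weight; the description only states that the weight is proportional to the reciprocal of the prediction loss value, so the exact expression and the function name are assumptions.

```python
import numpy as np

def updated_weights_inverse(losses, eps=1e-5):
    """Inverse variant: weight proportional to the reciprocal of the prediction loss,
    so low-loss (ordinary) samples are drawn more often and outlier samples less often."""
    return 1.0 / (np.abs(losses) + eps)
```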
As shown in Fig. 1, the training process of a supervised deep learning model based on the above dynamic adjustment of the sampling probability is specifically as follows (a code sketch of this loop is given after the steps):
In step 401, the classification labels corresponding to all picture samples are read in advance;
In step 402, the sampling weights of all the read image samples are initialized, with an initial value of 1;
In step 403, the sampling probability of each image sample is calculated;
In step 404, images and their corresponding classification labels are collected according to the sampling probability of each image sample;
In step 405, the collected images are fed into the supervised deep learning network model for training, and their loss values are obtained;
In step 406, it is judged whether the supervised deep learning network model has reached the upper limit of training iterations; if the upper limit is reached, training is terminated, otherwise step 407 is executed;
In step 407, the loss value of each image sample is calculated from the results of step 405;
In step 408, the weight of each sample is updated, and step 403 is executed after completion.
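The steps 401-408 can be combined into one loop, as sketched below. This is an illustration under the formulas already given, not the patent's implementation: `model_step` stands in for one training pass of the supervised network and is assumed to return one loss value per drawn sample, and all names and default values (batch size, α, ε) as well as the numpy-based sampling are assumptions.

```python
import numpy as np

def train_with_feedback_sampling(samples, labels, model_step, num_iters,
                                 batch_size=32, alpha=0.6, eps=1e-5, seed=0):
    """Feedback-training loop following steps 401-408 of Fig. 1 (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(samples)
    weights = np.ones(n)                               # 401-402: read samples, initialise weights to 1
    for _ in range(num_iters):                         # 406: stop at the iteration limit
        scaled = np.power(weights, alpha)              # 403: sampling probability of each sample
        probs = scaled / scaled.sum()
        idx = rng.choice(n, size=batch_size, p=probs)  # 404: draw images and their labels
        losses = model_step(samples[idx], labels[idx]) # 405: train the network, get per-sample losses
        weights[idx] = np.abs(losses) + eps            # 407-408: feed the losses back as new weights
    return weights
```

Only the drawn samples have their weights refreshed in this sketch, which matches steps 405-408, where the loss values come from the samples fed into the network in that iteration.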
The preferred embodiments of the present invention have been described in detail above. It should be understood that those skilled in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that can be obtained by those skilled in the art through logical analysis, reasoning or limited experiments on the basis of the prior art in accordance with the concept of the present invention shall fall within the protection scope determined by the claims.

Claims (7)

1. A supervised deep learning method based on feedback training, characterized in that, during training of a supervised deep learning model, at the start of each iteration, each sample in a training set is sampled with a sampling probability, and the sampling probability is dynamically adjusted according to the prediction loss value of each sample.
2. The supervised deep learning method based on feedback training according to claim 1, characterized in that the process of dynamically adjusting the sampling probability specifically comprises:
1) initializing the weight parameter of each sample;
2) calculating the corresponding sampling probability of each sample from the weight parameter it currently has:
P(i) = p_i^α / Σ_k p_k^α
where P(i) is the sampling probability of sample i, α is the priority factor, and p_i is the weight parameter of sample i;
3) after one iteration, obtaining the prediction loss value of each sample and updating the weight parameter based on the prediction loss value;
4) at the start of the next iteration, letting p_i = p(i) and returning to step 2).
3. The supervised deep learning method based on feedback training according to claim 2, characterized in that, when the weight parameters are initialized, the weight parameter of each sample is set to 1.
4. The supervised deep learning method based on feedback training according to claim 2, characterized in that updating the weight parameter based on the prediction loss value is specifically:
p(i) = |δ(i)| + ε
where p(i) is the updated weight parameter of sample i, δ(i) is the prediction loss value of sample i, and ε is a correction factor.
5. The supervised deep learning method based on feedback training according to claim 4, characterized in that the correction factor ε is a positive number greater than 0.
6. The supervised deep learning method based on feedback training according to claim 4, characterized in that the expression of the prediction loss value δ(i) is:
δ(i) = L(y_i, f(x_i))
where x_i is the input, y_i is the ground-truth label corresponding to x_i, the function f predicts a label from the input x_i, and the function L is the loss function that measures the difference between the ground-truth label y_i and the predicted label f(x_i).
7. The supervised deep learning method based on feedback training according to claim 2, characterized in that, when the weight parameter is updated based on the prediction loss value, the weight parameter is proportional to the reciprocal of the prediction loss value.
CN201811367393.4A 2018-11-16 2018-11-16 Supervised deep learning method based on feedback training Active CN109583485B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811367393.4A CN109583485B (en) 2018-11-16 2018-11-16 Supervised deep learning method based on feedback training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811367393.4A CN109583485B (en) 2018-11-16 2018-11-16 Supervised deep learning method based on feedback training

Publications (2)

Publication Number Publication Date
CN109583485A (en) 2019-04-05
CN109583485B (en) 2023-12-08

Family

ID=65922667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811367393.4A Active CN109583485B (en) 2018-11-16 2018-11-16 Supervised deep learning method based on feedback training

Country Status (1)

Country Link
CN (1) CN109583485B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052900A (en) * 2020-09-04 2020-12-08 京东数字科技控股股份有限公司 Machine learning sample weight adjusting method and device and storage medium
CN113420792A (en) * 2021-06-03 2021-09-21 阿波罗智联(北京)科技有限公司 Training method of image model, electronic equipment, road side equipment and cloud control platform
CN116484744A (en) * 2023-05-12 2023-07-25 北京百度网讯科技有限公司 Object simulation method, model training method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102700A (en) * 2014-07-04 2014-10-15 华南理工大学 Categorizing method oriented to Internet unbalanced application flow
US20150332169A1 (en) * 2014-05-15 2015-11-19 International Business Machines Corporation Introducing user trustworthiness in implicit feedback based search result ranking
CN105096375A (en) * 2014-05-09 2015-11-25 三星电子株式会社 Image processing method and apparatus

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096375A (en) * 2014-05-09 2015-11-25 三星电子株式会社 Image processing method and apparatus
US20150332169A1 (en) * 2014-05-15 2015-11-19 International Business Machines Corporation Introducing user trustworthiness in implicit feedback based search result ranking
CN104102700A (en) * 2014-07-04 2014-10-15 华南理工大学 Categorizing method oriented to Internet unbalanced application flow

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余成波; 田桐; 熊递恩; 许琳英: "Face Recognition under Joint Supervision of Center Loss and Softmax Loss", Journal of Chongqing University (重庆大学学报), no. 05 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052900A (en) * 2020-09-04 2020-12-08 京东数字科技控股股份有限公司 Machine learning sample weight adjusting method and device and storage medium
CN112052900B (en) * 2020-09-04 2024-05-24 京东科技控股股份有限公司 Machine learning sample weight adjustment method and device, and storage medium
CN113420792A (en) * 2021-06-03 2021-09-21 阿波罗智联(北京)科技有限公司 Training method of image model, electronic equipment, road side equipment and cloud control platform
CN116484744A (en) * 2023-05-12 2023-07-25 北京百度网讯科技有限公司 Object simulation method, model training method, device, equipment and storage medium
CN116484744B (en) * 2023-05-12 2024-01-16 北京百度网讯科技有限公司 Object simulation method, model training method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109583485B (en) 2023-12-08

Similar Documents

Publication Publication Date Title
CN105335756B (en) A kind of image classification method and image classification system based on Robust Learning model
CN108416370A (en) Image classification method, device based on semi-supervised deep learning and storage medium
CN107065842B (en) A kind of fault detection method based on particle group optimizing core independent component analysis model
CN109583485A Supervised deep learning method based on feedback training
CN110084221A (en) A kind of serializing face critical point detection method of the tape relay supervision based on deep learning
CN106228182B (en) SAR image classification method based on SPM and depth increments SVM
CN108229532A (en) Image-recognizing method, device and electronic equipment
CN114155397B (en) Small sample image classification method and system
CN108664986A (en) Based on lpThe multi-task learning image classification method and system of norm regularization
CN109325513A (en) A kind of image classification network training method based on magnanimity list class single image
CN113128478A (en) Model training method, pedestrian analysis method, device, equipment and storage medium
CN108564569B (en) A kind of distress in concrete detection method and device based on multicore classification learning
CN113989519B (en) Long-tail target detection method and system
CN113109782B (en) Classification method directly applied to radar radiation source amplitude sequence
CN110288026A (en) A kind of image partition method and device practised based on metric relation graphics
CN109902589A (en) A kind of target identification method and system based on Radar range profile's
Hang et al. Surface defect detection in sanitary ceramics based on lightweight object detection network
CN110929809A (en) Soft measurement method for key water quality index of sewage by using characteristic self-enhanced circulating neural network
CN109614999A (en) A kind of data processing method, device, equipment and computer readable storage medium
CN116486151A (en) Image classification model training method, image classification method, device and storage medium
CN103761530B (en) Hyperspectral image unmixing method based on relevance vector machine
CN112348700B (en) Line capacity prediction method combining SOM clustering and IFOU equation
CN109977797A (en) The optimization method of single order object detector based on sequence loss function
CN109740109A (en) A kind of PolSAR image broad object decomposition method based on unitary transformation
CN114078203A (en) Image recognition method and system based on improved PATE

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant