CN109816092A - Deep neural network training method, device, electronic equipment and storage medium - Google Patents

Deep neural network training method, device, electronic equipment and storage medium

Info

Publication number
CN109816092A
Authority
CN
China
Prior art keywords
training sample
neural network
training
loss function
network model
Prior art date
Legal status
Granted
Application number
CN201811528375.XA
Other languages
Chinese (zh)
Other versions
CN109816092B (en)
Inventor
柴振华
孟欢欢
Current Assignee
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN201811528375.XA
Publication of CN109816092A
Application granted
Publication of CN109816092B
Status: Active


Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a deep neural network training method in the field of computer technology, intended to solve the problem that neural networks trained with prior-art methods perform poorly in complex scenes. The method includes: obtaining a plurality of training samples provided with preset category labels, and training a neural network model based on the plurality of training samples; wherein the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model. By adaptively boosting the importance of the training samples of high discrimination difficulty among the training samples, the disclosed method prevents such samples from being misclassified by the trained neural network and helps improve the performance of the neural network.

Description

Deep neural network training method, device, electronic equipment and storage medium
Technical field
This application relates to the field of computer technology, and in particular to a deep neural network training method and apparatus, an electronic device, and a storage medium.
Background
In recent years, deep learning has made remarkable progress in the field of pattern recognition. Key factors include rich and flexible network models, strong computing power, and suitability for big-data processing. As neural networks are applied to different tasks, improving the network model has become a central research problem for those skilled in the art. Prior-art improvements to neural network models concentrate on two aspects: the network structure and the loss function. The loss functions commonly used for training classification models are mainly Softmax loss and Center loss, an improvement on Softmax presented at the top international conference ECCV 2016. However, through study of prior-art neural networks that use Center loss as the loss function, the applicant found that if the training set contains noisy or weakly discriminative training samples, models trained with the existing loss functions yield only limited improvement in classification or recognition results. Therefore, training a deep neural network in a way that accounts for the characteristics of the training samples can improve the performance of the trained network and, in turn, the accuracy of classification and recognition performed with the trained network.
Summary of the invention
The present application provides a deep neural network training method that helps improve the performance of the trained neural network and thereby the accuracy of the classification and recognition performed with it.
To solve the above problems, in a first aspect, an embodiment of the present application provides a deep neural network training method, including:
obtaining a plurality of training samples provided with preset category labels;
training a neural network model based on the plurality of training samples;
wherein the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model; the discrimination difficulty of each training sample is proportional to the distance between the training sample and the corresponding class center, the corresponding class center being the center of the category containing the training sample, obtained after clustering the plurality of training samples.
In a second aspect, an embodiment of the present application provides a deep neural network training apparatus, including:
a training-sample acquisition module for obtaining a plurality of training samples provided with preset category labels;
a model training module for training a neural network model based on the plurality of training samples;
wherein the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model.
In a third aspect, an embodiment of the present application also discloses an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the computer program, the deep neural network training method described in the embodiments of the present application is realized.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the deep neural network training method disclosed in the embodiments of the present application are performed.
In the deep neural network training method disclosed in the embodiments of the present application, a plurality of training samples provided with preset category labels are obtained, and a neural network model is trained based on the plurality of training samples; the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the model. This solves the problem that neural networks trained with prior-art methods perform poorly in complex scenes. By improving the loss function of the neural network, the disclosed method adaptively boosts the importance of training samples of high discrimination difficulty among the training samples, prevents such samples from being misclassified by the trained network, and helps improve the performance of the trained network and thus the accuracy of classification and recognition performed with it.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings required for the embodiments or the prior-art description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the deep neural network training method of Embodiment 1 of the present application;
Fig. 2 is a schematic diagram of a clustering result in the deep neural network training method of Embodiment 1 of the present application;
Fig. 3 is a flowchart of object classification and recognition based on the neural network training method of Embodiment 1 of the present application;
Fig. 4 is a first structural diagram of the deep neural network training apparatus of Embodiment 4 of the present application;
Fig. 5 is a second structural diagram of the deep neural network training apparatus of Embodiment 4 of the present application;
Fig. 6 is a third structural diagram of the deep neural network training apparatus of Embodiment 4 of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Embodiment 1
This embodiment discloses a deep neural network training method. As shown in Fig. 1, the method comprises step 110 and step 120.
Step 110: obtain a plurality of training samples provided with preset category labels.
Before training the neural network, a plurality of training samples provided with preset category labels must first be obtained.
The form of the training samples differs with the specific application scenario. For example, in a work-uniform recognition application, the training samples are uniform images; in a face liveness-detection scenario, the training samples are images of live faces and of non-live faces (such as face models or face photographs) captured by an image acquisition device; in a voice recognition scenario, each training sample is a segment of audio.
The category labels of the training samples differ with the output of the specific recognition task. Taking the training of a neural network that performs the uniform recognition task as an example, depending on the output of the recognition task, the categories of the training samples may include labels indicating different uniform classes such as Meituan take-away uniforms and Baidu take-away uniforms. Taking the training of a neural network that performs a voice recognition task as an example, the categories may include labels indicating different voice classes such as male and female. Taking the training of a neural network that performs a face liveness-detection task as an example, the categories may include labels indicating the two classes of live faces and non-live faces.
Step 120: train a neural network model based on the plurality of training samples.
The loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model.
In some embodiments of the present application, before training the neural network model, the method further includes: clustering the training samples having the same category label and determining the class center of that category of training samples. The discrimination difficulty of each training sample is proportional to the distance between the training sample and the corresponding class center. Specific methods for clustering the training samples are known in the prior art and are not repeated in the embodiments of the present application.
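As an illustrative sketch (not part of the patent text; all names are hypothetical), the class center of each label can be taken as the mean embedding of that label's samples, and a sample's distance to its center then measures its discrimination difficulty:

```python
import numpy as np

def class_centers(features: np.ndarray, labels: np.ndarray) -> dict:
    """Return {label: mean embedding}, one center per category label."""
    return {int(c): features[labels == c].mean(axis=0) for c in np.unique(labels)}

# Toy usage: 6 samples, 4-dim embeddings, 2 labels.
feats = np.random.randn(6, 4).astype(np.float32)
labs = np.array([0, 0, 0, 1, 1, 1])
centers = class_centers(feats, labs)
dists = {i: np.linalg.norm(feats[i] - centers[int(labs[i])]) for i in range(6)}
# Per the text above, a larger dists[i] means a harder-to-discriminate sample.
```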
After the training samples provided with preset category labels have been obtained, a deep neural network is constructed, and then the constructed neural network model is trained based on the training samples.
In a specific implementation of the present application, the backbone network may be a ResNet (residual network), for example the ResNet50 residual network; the loss function and training procedure of the neural network are then improved to raise the performance of the trained network. For example, the loss function of the neural network may be set to a joint loss consisting of a softmax loss function and a center loss function based on an attention mechanism. With this improved loss function, when the output loss value is computed, the samples far from their cluster center, i.e. the samples of high discrimination difficulty, adaptively exert a larger influence on the loss value output by the loss function, so that such samples are not ignored during model training, which would otherwise lower the classification or recognition accuracy of the trained model.
In a specific implementation, the loss function of the neural network model may, for example, be

$$L = L_S + \lambda L_C$$

where $L$ denotes the joint loss, $L_S$ is the softmax loss function

$$L_S = -\frac{1}{m}\sum_{i=1}^{m}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j}e^{W_j^{T}x_i+b_j}}$$

and $L_C$ is the center loss function based on the attention mechanism

$$L_C = \frac{1}{2}\sum_{i=1}^{m}(1-k_i)^{\gamma}\left\|x_i-c_{y_i}\right\|_2^2.$$

In the above formulas, $i$ denotes the index of a training sample and $m$ the total number of training samples; $y_i$ denotes the category identifier input to the loss function in the neural network model, and $x_i$ a training sample belonging to category $y_i$; $W_j^{T}$ denotes the $j$-th column of the weight matrix of the last fully connected layer before the loss function and $b_j$ the $j$-th element of the bias $b$ of that layer, while $W_{y_i}^{T}$ and $b_{y_i}$ denote its $y_i$-th column and element; $c_{y_i}$ denotes the cluster center of category $y_i$; $(1-k_i)^{\gamma}$ denotes the first weight of training sample $x_i$, which is proportional to the distance between $x_i$ and the cluster center $c_{y_i}$ (the form of $k_i$ is given in Embodiment 2); $T$ denotes transposition; $\lambda$ denotes a scalar; and $\gamma$ denotes a scalar greater than 0, for example $\gamma = 2$.
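The following PyTorch sketch (not from the patent; function and argument names are hypothetical) illustrates this joint loss, assuming the Gaussian form of $k_i$ given in Embodiment 2 and averaging the center term over the batch for scale stability:

```python
import torch
import torch.nn.functional as F

def joint_loss(logits, embeddings, labels, centers, lam=0.1, gamma=2.0, sigma=1.0):
    """L = L_S + lambda * L_C with the distance-proportional first weight.

    `centers` is assumed to be a [num_classes, dim] tensor of class centers.
    """
    l_s = F.cross_entropy(logits, labels)               # softmax loss term
    diff = embeddings - centers[labels]                 # x_i - c_{y_i}
    sq_dist = diff.pow(2).sum(dim=1)                    # ||x_i - c_{y_i}||^2
    k = torch.exp(-sq_dist / (2.0 * sigma ** 2))        # k_i shrinks with distance
    l_c = 0.5 * ((1.0 - k) ** gamma * sq_dist).mean()   # attention-based center loss
    return l_s + lam * l_c
```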
In practice, after an initial classification, the samples close to the classification hyperplane are easily misclassified into other categories under the influence of noise, causing the system to misjudge. Fig. 2 is a schematic diagram of such an initial classification result: samples 211, 212, 221 and 222 lie close to the classification hyperplane, i.e. far from their respective class centers. Under the influence of noise, samples 211 and 212 are easily misclassified into category 22, and likewise samples 221 and 222 into category 21. If the loss function of the neural network uses only the softmax term, i.e. only $L_S$ in the formula above, it accounts only for the separability between samples and ignores discriminability (the classes must not merely be separated, but separated with a certain margin), yielding a neural network model of low classification effectiveness and reduced performance.
For this reason, the prior art adds a center loss function to take into account the influence of the distance between a training sample and its category on the loss value. However, the prior art treats all training samples alike: no matter how far a training sample lies from its class center, its influence on the loss value output by the loss function is the same. Through repeated experiments, the inventors found that training samples such as 211 and 212 in Fig. 2, at their distance from the center of category 21, should not influence the loss value in the same way as other samples. Taking the training of a neural network for the uniform classification task as an example, Meituan uniforms and Dianwoda uniforms both belong to the category of Meituan take-away uniforms; the Meituan uniforms, which usually lie close to the cluster center, are numerous, while the Dianwoda uniforms, being fewer, end up far from the cluster center after classification. If, when the loss value of the loss function is computed, a Dianwoda uniform and a Meituan uniform exert the same influence on the loss value, i.e. carry the same sample weight, the trained neural network will wrongly classify Dianwoda uniforms and Meituan uniforms into different categories when recognizing uniforms. This embodiment therefore adaptively sets a corresponding weight for each training sample according to its distance from the cluster center, so as to boost the influence of the training samples of high discrimination difficulty on the loss value $L_C$. In a specific implementation, the weight of training sample $x_i$ is proportional to the distance between $x_i$ and the cluster center $c_{y_i}$.
In the specific training process, the computer evaluates the above loss function for each input training sample to obtain its loss function value and compares it against the sample label; then, by continually adjusting the parameters of the loss function, it repeatedly recomputes the loss function values of the input training samples under different parameters, finally determines the parameters that minimize the discrepancy between the loss function values and the sample labels, and takes them as the parameters of the loss function. Based on the determined parameters, the output values of the samples input to the neural network are computed; the output value of the neural network corresponds to the classification result of the input sample.
In specific training, the neural network can be optimized by back-propagation and gradient descent so that the loss value output by the loss function is minimized, yielding the optimized neural network.
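A minimal sketch of that optimization loop, reusing the joint_loss sketch above; the two-headed model interface (embeddings and logits) is an assumption, not the patent's prescribed architecture:

```python
import torch

def train_epoch(model, loader, centers, optimizer, device="cpu"):
    """One epoch of forward pass, joint loss, back-propagation, SGD step."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        embeddings, logits = model(images)   # assumed: model returns both heads
        loss = joint_loss(logits, embeddings, labels, centers)
        optimizer.zero_grad()
        loss.backward()                      # back-propagation
        optimizer.step()                     # gradient descent update
```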
In some embodiments of the present application, the method of adjusting the cluster centers by gradient descent can be further optimized: the discrimination difficulty of the training samples is also taken into account when adjusting the cluster centers, which further improves the performance of the trained network model.
The specific training procedure of the neural network is known in the prior art and is not repeated in this embodiment.
In a specific implementation, other network structures may also be used as the base network of the neural network; the loss-function optimization applies to the training of any network structure. The present application does not limit the specific network structure of the neural network, only the realization and optimization of the loss function.
In the deep neural network training method disclosed in this embodiment of the present application, a plurality of training samples provided with preset category labels are obtained, the training samples with the same label are clustered to determine the class centers, and a neural network model is then trained based on the plurality of training samples; the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the model. This solves the problem that neural networks trained with prior-art methods perform poorly in complex scenes. By improving the loss function of the neural network, the disclosed method adaptively boosts the importance of the training samples of high discrimination difficulty, prevents them from being misclassified by the trained network, and helps improve the performance of the trained neural network.
Embodiment 2
Based on Embodiment 1, this embodiment discloses an optimization of the deep neural network training method.
In a specific implementation, after the plurality of training samples provided with preset category labels has been obtained, the neural network is constructed first. This embodiment again uses ResNet50 (a residual network) as the base network to construct the neural network, which comprises multiple feature-extraction layers. In the forward-propagation stage, the network calls the forward function of each feature-extraction layer (such as the fully connected layers) in turn to obtain layer-by-layer outputs; the output of the last layer is compared with the target by the loss function to compute the error update values. Back-propagation then proceeds layer by layer back to the first layer, and at its end all weights are updated together. The last feature-extraction layer feeds the extracted features, as the prediction of the neural network, into the loss function, which through a series of computations obtains the difference between the prediction and the true label; this difference is taken as the loss value of the neural network. The purpose of training the neural network is to minimize the difference between the prediction and the true label.
In other preferred embodiments of the present application, the first weight can be expressed through a normal distribution function of the distance between a training sample and its class center, for example

$$k_i = e^{-\left\|x_i-c_{y_i}\right\|_2^2/(2\sigma^2)},\qquad \text{first weight} = (1-k_i)^{\gamma},$$

where $\sigma$ is a constant, $x_i$ denotes a training sample belonging to category $y_i$, and $c_{y_i}$ denotes the center of category $y_i$. From the above formula, the greater the discrimination difficulty of a training sample, i.e. the greater the distance between $x_i$ and its class center $c_{y_i}$, the larger the value of the first weight $(1-k_i)^{\gamma}$; that is, the harder a training sample is to discriminate, the more its importance must be raised during training of the neural network.
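In code, the Gaussian form of $k_i$ and the resulting first weight can be written as follows (a sketch; $\sigma$ and $\gamma$ are hyperparameters):

```python
import numpy as np

def first_weight(sq_dist: np.ndarray, sigma: float = 1.0, gamma: float = 2.0):
    """(1 - k_i)^gamma with k_i = exp(-||x_i - c_{y_i}||^2 / (2 sigma^2))."""
    k = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return (1.0 - k) ** gamma

# The weight grows monotonically with the squared distance to the class center:
print(first_weight(np.array([0.1, 1.0, 10.0])))   # small -> large
```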
In other embodiments of the present application, the first weight may also be computed from other formulas proportional to the distance between the training sample and the class center; this embodiment does not enumerate them.
Specifically in this embodiment, during back-propagation the cluster centers are continually adjusted so that the loss value output by the loss function, which expresses the error between the predicted values and the true values of the training samples, is minimized. In a specific implementation, the cluster center is usually updated by the formula

$$c_j^{t+1} = c_j^{t} - \alpha_c\,\Delta c_j^{t}.$$

When computing the update amount $\Delta c_j$ of a class center, the present application emphasizes the training samples close to the class center and attenuates the training samples far from it. Therefore, when updating a class center, the neural network model performs weighting according to a second weight inversely proportional to the discrimination difficulty of each training sample to determine the change of the corresponding class center. For the classification result of Fig. 2, when computing the update amount $\Delta c_{22}$ of the center of category 22, the contribution of samples of low discrimination difficulty such as sample 222, i.e. samples close to the class center, is emphasized, while the contribution of samples of high discrimination difficulty such as sample 221, i.e. samples far from the class center, is attenuated. In a specific implementation, this can be achieved by assigning each training sample a weight according to its distance from the center of its category when computing the update amount $\Delta c_j$.
For example, performing weighting according to the second weight inversely proportional to the discrimination difficulty of each training sample to determine the change of the corresponding class center includes: determining the change of class center $c_j$ according to the formula

$$\Delta c_j = \frac{\sum_{i=1}^{m}\delta(y_i=j)\,q_i\,(c_j-x_i)}{1+\sum_{i=1}^{m}\delta(y_i=j)\,q_i}$$

where $i$ denotes the index of a training sample, $m$ the total number of training samples, $j$ and $y_i$ category identifiers input to the loss function in the neural network model, and $x_i$ a training sample belonging to category $y_i$; $q_i$ denotes the second weight, inversely proportional to the distance between $x_i$ and the class center $c_j$; $\delta(\cdot)$ is the Dirac (indicator) function, equal to 1 when the condition in the brackets holds and 0 otherwise; and $\alpha_c$ is a scalar controlling the learning rate of the class centers, with value range $[0,1]$.
In connection with Embodiment 1, taking the loss function $L = L_S + \lambda L_C$ with the attention-based center loss $L_C = \frac{1}{2}\sum_{i=1}^{m}(1-k_i)^{\gamma}\|x_i-c_{y_i}\|_2^2$ as an example, the specific scheme for adjusting the cluster centers is illustrated below.
The change of a class center is determined by taking the partial derivative of the center loss value with respect to the samples; the derivation is as follows:

$$\frac{\partial L_C}{\partial x_i} = (1-k_i)^{\gamma}\,(x_i-c_{y_i}).$$

Further, the update amount follows as

$$\Delta c_j = \frac{\sum_{i=1}^{m}\delta(y_i=j)\,q_i\,(c_j-x_i)}{1+\sum_{i=1}^{m}\delta(y_i=j)\,q_i}$$

where $q_i = e^{-\|x_i-c_j\|_2^2/(2\sigma_c^2)}$, $\sigma_c$ is a constant, $x_i$ denotes a training sample belonging to category $y_i$, and $c_{y_i}$ denotes the center of category $y_i$. That is, in some embodiments of the present application, the second weight $q_i$ can be expressed through a normal distribution function of the distance between the training sample and the class center.
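A NumPy sketch of one such class-center update under the Gaussian choice of $q_i$ (names are illustrative, not from the patent):

```python
import numpy as np

def update_center(c_j, features, labels, j, alpha_c=0.5, sigma_c=1.0):
    """c_j <- c_j - alpha_c * delta_c_j, with q_i decaying with distance."""
    x = features[labels == j]                        # samples of category j
    sq_dist = ((x - c_j) ** 2).sum(axis=1)
    q = np.exp(-sq_dist / (2.0 * sigma_c ** 2))      # near samples weigh more
    delta = (q[:, None] * (c_j - x)).sum(axis=0) / (1.0 + q.sum())
    return c_j - alpha_c * delta
```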
In other embodiments of the present application, the second weight $q_i$ may also be expressed by other formulas inversely proportional to the distance between the training sample and the class center; this embodiment does not enumerate them. Preferably, the computation of the second weight matches that of the first weight.
By attenuating the influence of the harder-to-discriminate training samples on the class center while boosting that of the easier-to-discriminate training samples when updating the class centers, the classification accuracy of the trained neural network can be further improved, raising the performance of the neural network model.
In other preferred embodiments of the present application, the loss function of the neural network model is configured to: when the neural network model computes the loss value of a training sample, adjust the loss value of the training sample by a third weight inversely proportional to the proportion of the training sample's category.
Specifically, the loss function of the neural network model is expressed as:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\beta_{y_i}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j}e^{W_j^{T}x_i+b_j}} + \frac{\lambda}{2}\sum_{i=1}^{m}(1-k_i)^{\gamma}\left\|x_i-c_{y_i}\right\|_2^2$$

where $L$ denotes the loss function; $i$ denotes the index of a training sample and $m$ the total number of training samples; $y_i$ denotes the category identifier input to the loss function in the neural network model and $x_i$ a training sample belonging to category $y_i$; $W_j^{T}$ denotes the $j$-th column of the weight matrix of the last fully connected layer before the loss function, $b_j$ the $j$-th element of the bias $b$ of that layer, and $W_{y_i}^{T}$ and $b_{y_i}$ its $y_i$-th column and element; $(1-k_i)^{\gamma}$ denotes the first weight of training sample $x_i$, where $k_i$ takes a value greater than 0 and less than 1 and is inversely proportional to the distance between $x_i$ and the center of its category; $\beta_{y_i}$ denotes the third weight of training sample $x_i$, whose value is inversely proportional to the proportion of the training samples of category $y_i$ in the full training set; $T$ denotes transposition; and $\lambda$ and $\gamma$ denote scalars. In a specific implementation, $k_i$ may also be expressed by other formulas inversely proportional to the distance between the training sample and the class center.
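A sketch of how such a third weight could be derived from the label histogram; the final rescaling is an added convention for illustration, not specified by the patent:

```python
import numpy as np

def third_weights(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-class weight inversely proportional to each class's sample share."""
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    share = counts / counts.sum()                  # class proportion in full set
    beta = 1.0 / np.maximum(share, 1e-12)          # inverse proportionality
    return beta * num_classes / beta.sum()         # rescale so mean(beta) == 1
```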
Embodiment 3
This embodiment of the present application further discloses a deep neural network training method applied to a classification application. As shown in Fig. 3, the method includes steps 310 to 370.
Step 310: obtain a plurality of training samples provided with preset category labels.
In a specific implementation of the present application, the training samples include any of the following: images, text, voice. For different objects to be classified, training samples matching the objects to be classified by the neural network model must be obtained. In this embodiment, taking the training of a neural network model for uniform recognition as an example, uniform images provided with the labels of different platforms are first obtained, e.g. uniform images provided with a Meituan take-away platform label, uniform images provided with an Ele.me platform label, and uniform images provided with a Baidu take-away platform label.
Step 320: cluster the training samples having the same category label and determine the class center of that category of training samples.
The discrimination difficulty of each training sample is proportional to the distance between the training sample and the corresponding class center.
For clustering the training samples with the same category label and determining the class center of that category, refer to Embodiment 2; it is not repeated in this embodiment.
Step 330: train a neural network model based on the plurality of training samples.
The loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model; the discrimination difficulty of each training sample is proportional to the distance between the training sample and the corresponding class center, the corresponding class center being the center of the category containing the training sample, obtained after clustering the plurality of training samples.
The neural network model is then trained based on the acquired uniform images.
When the neural network model is trained based on the acquired uniform images, the training samples with the same category label among the acquired samples are first clustered, and the class center corresponding to the training samples of each category label is determined. Within each category of the clustered training samples, a sample's distance from the class center characterizes its discrimination difficulty: the greater the distance between the sample and the class center, the greater the sample's discrimination difficulty; conversely, the smaller the distance, the smaller the discrimination difficulty.
The training process of the neural network consists of the computer evaluating the above loss function for each input training sample to obtain its loss function value, comparing the computed loss function value against the sample label, and then, by continually adjusting the parameters of the loss function, repeatedly recomputing the loss function values of the input training samples under different parameters, finally determining the parameters that minimize the discrepancy between the loss function values and the sample labels and taking them as the parameters of the loss function.
For the specific implementation of training the neural network model based on the acquired uniform images, refer to Embodiments 1 and 2 above; it is not repeated in this embodiment.
Step 340: obtain, through a data acquisition device, object data of an object to be classified that matches the training samples.
For different objects to be classified, the object data of the object is obtained through a corresponding data acquisition device. For example, when the object to be classified is a work uniform, images of delivery personnel can be captured by a camera to obtain uniform image data.
Step 350: obtain the classification features of the object data.
Using the same method as during model training, the feature vector of the uniform image data is obtained and used as the classification features.
Step 360: input the classification features into the trained neural network model and obtain the output result of the neural network model.
The obtained classification features are input into the neural network model trained in the preceding steps to obtain the output result of the neural network model.
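A sketch of the inference side of steps 350 to 370, under the same assumed two-headed model interface as the earlier training sketch; the confidence threshold and names are illustrative:

```python
import torch

def classify_and_act(model, image_tensor, class_names, threshold=0.9):
    """Extract features, classify, and gate the preset operation on confidence."""
    model.eval()
    with torch.no_grad():
        _, logits = model(image_tensor.unsqueeze(0))   # steps 350/360
        probs = torch.softmax(logits, dim=1).squeeze(0)
    conf, idx = probs.max(dim=0)
    label = class_names[idx.item()]
    if conf.item() >= threshold:                       # step 370: act on result
        return label, conf.item()   # e.g. trigger the dispatch operation here
    return None, conf.item()
```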
Step 370: perform a preset operation according to the output result.
In some embodiments of the present application, performing a preset operation according to the output result includes any one or more of: displaying the classification result of the object to be classified corresponding to the output result; outputting an access control signal according to the output result; performing a work-order dispatch operation according to the output result.
Specifically, for the uniform recognition application scenario, when the object to be recognized is a work uniform, this step obtains the output confidence that the uniform image belongs to each category label. In a specific implementation, for the Meituan take-away platform, when a delivery person is recognized as wearing a Meituan uniform, a dispatch operation can be performed for that delivery person; when a delivery person is recognized as wearing a Baidu uniform, no dispatch operation is performed for that person.
With the neural network training method disclosed in the present application, in the process of training a neural network for the uniform recognition task, the training samples of Meituan uniforms are very numerous while the training samples of Dianwoda uniforms are few, and the Meituan uniforms and Dianwoda uniforms share the same category label. The influence of each Dianwoda uniform training sample during training, i.e. its influence on the loss value output by the loss function, can therefore be appropriately boosted, preventing the few samples of such a class from being ignored during classification and the trained neural network from misclassifying them.
The inventors trained neural network models for the uniform recognition task, built on different loss functions, on the same training data set, and tested the trained networks on the same test data set; the classification accuracy of the model built on the loss function disclosed in this embodiment improved considerably. For example, testing with actual spot-check images of Meituan delivery riders, i.e. with Meituan uniform image features as test samples: the classification accuracy of the neural network built on the softmax loss function was 97.39%; that of the prior-art network built on the softmax loss function plus the attention-based center loss function was 97.41%; and that of the network disclosed in the present application, built on the softmax loss function combined with the weighted attention-based center loss function, was 98.17%. In an attack test with spot-check images of non-Meituan riders (including registration images and images of riders wearing other uniforms), i.e. with non-Meituan and non-Dianwoda uniform image features as test samples: the probability of the softmax-only network classifying them as Meituan uniforms was 0.41%; the prior-art network built on the softmax loss function plus the attention-based center loss function likewise 0.41%; and the network disclosed in the present application 0.40%, a clear reduction in the misclassification rate.
In the neural network training method disclosed in this embodiment of the present application, a plurality of training samples provided with preset category labels are obtained, the training samples with the same category label are clustered to determine the class center of each category, and a neural network model is then trained based on the plurality of training samples. In online application, object data of an object to be classified matching the training samples is obtained through a data acquisition device, and the classification features of the object data are further obtained; the classification features are then input into the trained neural network model to obtain its output result, and finally a preset operation is performed according to the output result. Because the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the model's loss value, and the discrimination difficulty of each training sample is proportional to its distance from the corresponding class center, the accuracy with which the trained model determines the object category is improved, so that the preset operation is performed accurately.
Embodiment 4
This embodiment discloses a deep neural network training apparatus. As shown in Fig. 4, the apparatus includes:
a training-sample acquisition module 410 for obtaining a plurality of training samples provided with preset category labels;
a model training module 420 for training a neural network model based on the plurality of training samples;
wherein the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model.
Optionally, as shown in Fig. 5, the apparatus further includes:
a clustering module 430 for clustering the training samples having the same category label and determining the class center of that category of training samples;
wherein the discrimination difficulty of each training sample is proportional to the distance between the training sample and the corresponding class center.
Optionally, when updating the class center, the neural network model performs weighting according to a second weight inversely proportional to the discrimination difficulty of each training sample to determine the change of the corresponding class center.
Optionally, the loss function of the neural network model is configured to: when the neural network model computes the loss value of a training sample, adjust the loss value of the training sample by a third weight inversely proportional to the proportion of the training sample's category.
Further optionally, the loss function of the neural network model is expressed as:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\beta_{y_i}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j}e^{W_j^{T}x_i+b_j}} + \frac{\lambda}{2}\sum_{i=1}^{m}(1-k_i)^{\gamma}\left\|x_i-c_{y_i}\right\|_2^2$$

where the symbols are as in Embodiment 2: $L$ denotes the loss function; $i$ denotes the index of a training sample and $m$ the total number of training samples; $y_i$ denotes the category identifier input to the loss function in the neural network model and $x_i$ a training sample belonging to category $y_i$; $W_j^{T}$ and $b_j$ denote the $j$-th column of the weight matrix and the $j$-th element of the bias $b$ of the last fully connected layer before the loss function, and $W_{y_i}^{T}$ and $b_{y_i}$ its $y_i$-th column and element; $(1-k_i)^{\gamma}$ denotes the first weight of training sample $x_i$, where $k_i$ takes a value greater than 0 and less than 1 and is inversely proportional to the distance between $x_i$ and the center of its category; $\beta_{y_i}$ denotes the third weight of training sample $x_i$, inversely proportional to the proportion of the training samples of category $y_i$ in the full training set; $T$ denotes transposition; and $\lambda$ and $\gamma$ denote scalars.
In other preferred embodiments of the present application, $k_i = e^{-\|x_i-c_{y_i}\|_2^2/(2\sigma^2)}$, where $\sigma$ is a constant, $x_i$ denotes a training sample belonging to category $y_i$, and $c_{y_i}$ denotes the center of category $y_i$. From this formula, the greater the discrimination difficulty of a training sample, i.e. the greater the distance between $x_i$ and its class center $c_{y_i}$, the smaller $k_i$ and the larger the first weight $(1-k_i)^{\gamma}$; that is, the harder a training sample is to discriminate, the more its importance must be raised during training of the neural network.
During back-propagation, the cluster centers are continually adjusted so that the loss value output by the loss function, expressing the error between the predicted values and the true values of the training samples, is minimized. In a specific implementation, the cluster center is usually updated by the formula $c_j^{t+1} = c_j^{t} - \alpha_c\,\Delta c_j^{t}$. When computing the update amount $\Delta c_j$ of a class center, the present application emphasizes the training samples close to the class center and attenuates the training samples far from it. Further optionally, performing weighting according to the second weight inversely proportional to the discrimination difficulty of each training sample to determine the change of the corresponding class center includes:
determining the change of class center $c_j$ according to the formula

$$\Delta c_j = \frac{\sum_{i=1}^{m}\delta(y_i=j)\,q_i\,(c_j-x_i)}{1+\sum_{i=1}^{m}\delta(y_i=j)\,q_i}$$

where $i$ denotes the index of a training sample, $m$ the total number of training samples, $j$ and $y_i$ category identifiers input to the loss function in the neural network model, and $x_i$ a training sample belonging to category $y_i$; $q_i$ denotes the second weight, inversely proportional to the distance between $x_i$ and the class center $c_j$; $\delta(\cdot)$ is the Dirac (indicator) function, equal to 1 when the condition in the brackets holds and 0 otherwise; and $\alpha_c$ is a scalar controlling the learning rate of the class centers, with value range $[0,1]$.
In some embodiments of the present application, $q_i = e^{-\|x_i-c_j\|_2^2/(2\sigma_c^2)}$, where $\sigma_c$ is a constant, $x_i$ denotes a training sample belonging to category $y_i$, and $c_{y_i}$ denotes the center of category $y_i$.
Optionally, the training samples include any of the following: images, text, voice.
Optionally, as shown in Fig. 6, the apparatus further includes:
a data acquisition module 440 for obtaining, through a data acquisition device, object data of an object to be classified that matches the training samples;
a feature acquisition module 450 for obtaining the classification features of the object data;
a model invocation module 460 for inputting the classification features into the trained neural network model and obtaining the output result of the neural network model;
an execution module 470 for performing a preset operation according to the output result.
Optionally, performing a preset operation according to the output result includes any one or more of the following:
displaying the classification result of the object to be classified corresponding to the output result;
outputting an access control signal according to the output result;
performing a work-order dispatch operation according to the output result.
The deep neural network training apparatus disclosed in this embodiment of the present application is used to implement the steps of the deep neural network training method described in Embodiments 1 and 2 of the present application; for the specific implementation of each module of the apparatus, refer to the corresponding steps, which are not repeated here.
In the deep neural network training apparatus disclosed in this embodiment, a plurality of training samples provided with preset category labels are obtained, the training samples with the same category label are clustered to determine the class center of each category, and a neural network model is trained based on the plurality of training samples; the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the model. This solves the problem that neural networks trained with prior-art methods perform poorly in complex scenes. By improving the loss function of the neural network, the apparatus adaptively boosts the importance of the training samples of high discrimination difficulty, prevents them from being misclassified by the trained network model, and helps improve the performance of the trained neural network.
In online application, the neural network training apparatus disclosed in this embodiment obtains, through a data acquisition device, object data of an object to be classified that matches the training samples, and further obtains the classification features of the object data; the classification features are then input into the trained neural network model to obtain its output result, and finally a preset operation is performed according to the output result. Because the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the model's loss value, and the discrimination difficulty of each training sample is proportional to its distance from the corresponding class center, the accuracy with which the trained model determines the object category is improved, so that the preset operation is performed accurately.
Correspondingly, the present application also discloses an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor; when the processor executes the computer program, the deep neural network training method described in Embodiments 1 to 3 of the present application is realized. The electronic device may be a PC, a mobile terminal, a personal digital assistant, a tablet computer, etc.
The present application also discloses a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the deep neural network training method described in Embodiments 1 to 3 of the present application are performed.
The embodiments in this specification are described in a progressive manner; each embodiment highlights its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another. Since the apparatus embodiments are basically similar to the method embodiments, their description is relatively brief; for the relevant parts, refer to the description of the method embodiments.
The deep neural network training method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementation of the application; the description of the above embodiments is only intended to help understand the method of the application and its core ideas. Meanwhile, for a person of ordinary skill in the art, there will be changes in the specific implementation and application scope according to the ideas of the application. In summary, the contents of this specification should not be construed as limiting the application.
From the above description of the embodiments, a person skilled in the art can clearly understand that each embodiment can be realized by means of software plus a necessary general hardware platform, or of course by hardware. Based on this understanding, the essence of the above technical solution, or the part contributing to the prior art, can be embodied in the form of a software product stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, including several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in each embodiment or certain parts of the embodiments.

Claims (20)

1. A deep neural network training method, characterized by comprising:
obtaining a plurality of training samples provided with preset category labels;
training a neural network model based on the plurality of training samples;
wherein the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model.
2. The method according to claim 1, characterized in that, before the step of training a neural network model based on the plurality of training samples, the method further comprises:
clustering the training samples having the same category label and determining the class center of that category of training samples;
wherein the discrimination difficulty of each training sample is proportional to the distance between the training sample and the corresponding class center.
3. The method according to claim 2, characterized in that, when updating the class center, the neural network model is configured to perform weighting according to a second weight inversely proportional to the discrimination difficulty of each training sample to determine the change of the corresponding class center.
4. The method according to claim 3, characterized in that the loss function of the neural network model is configured to: when the neural network model computes the loss value of a training sample, adjust the loss value of the training sample by a third weight inversely proportional to the proportion of the training sample's category.
5. The method according to claim 4, characterized in that the loss function of the neural network model is expressed as:

$$L = -\frac{1}{m}\sum_{i=1}^{m}\beta_{y_i}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j}e^{W_j^{T}x_i+b_j}} + \frac{\lambda}{2}\sum_{i=1}^{m}(1-k_i)^{\gamma}\left\|x_i-c_{y_i}\right\|_2^2$$

wherein $L$ denotes the loss function; $i$ denotes the index of a training sample and $m$ the total number of training samples; $y_i$ denotes the category identifier input to the loss function in the neural network model and $x_i$ a training sample belonging to category $y_i$; $W_j^{T}$ denotes the $j$-th column of the weight matrix of the last fully connected layer before the loss function, $b_j$ the $j$-th element of the bias $b$ of that layer, and $W_{y_i}^{T}$ and $b_{y_i}$ its $y_i$-th column and element; $(1-k_i)^{\gamma}$ denotes the first weight of training sample $x_i$, where $k_i$ takes a value greater than 0 and less than 1 and is inversely proportional to the distance between $x_i$ and the center of its category; $\beta_{y_i}$ denotes the third weight of training sample $x_i$, whose value is inversely proportional to the proportion of the training samples of category $y_i$ in the full training set; $T$ denotes transposition; and $\lambda$ and $\gamma$ denote scalars.
6. The method according to claim 3, characterized in that performing weighting according to the second weight inversely proportional to the discrimination difficulty of each training sample to determine the change of the corresponding class center comprises:
determining the change of class center $c_j$ according to the formula

$$\Delta c_j = \frac{\sum_{i=1}^{m}\delta(y_i=j)\,q_i\,(c_j-x_i)}{1+\sum_{i=1}^{m}\delta(y_i=j)\,q_i}$$

wherein $i$ denotes the index of a training sample, $m$ the total number of training samples, $j$ and $y_i$ category identifiers input to the loss function in the neural network model, and $x_i$ a training sample belonging to category $y_i$; $q_i$ denotes the second weight, inversely proportional to the distance between training sample $x_i$ and the class center $c_j$; $\delta(\cdot)$ is the Dirac (indicator) function, equal to 1 when the condition in the brackets holds and 0 otherwise; and $\alpha_c$ is a scalar controlling the learning rate of the class centers, with value range $[0,1]$.
7. The method according to any one of claims 1 to 6, characterized in that the training samples comprise any one of the following: images, text, voice.
8. The method according to claim 7, characterized in that the method further comprises:
obtaining, through a data acquisition device, object data of an object to be classified that matches the training samples;
obtaining the classification features of the object data;
inputting the classification features into the trained neural network model and obtaining the output result of the neural network model;
performing a preset operation according to the output result.
9. The method according to claim 8, characterized in that performing a preset operation according to the output result comprises any one or more of the following:
displaying the classification result of the object to be classified corresponding to the output result;
outputting an access control signal according to the output result;
performing a work-order dispatch operation according to the output result.
10. A deep neural network training apparatus, characterized by comprising:
a training-sample acquisition module for obtaining a plurality of training samples provided with preset category labels;
a model training module for training a neural network model based on the plurality of training samples;
wherein the loss function of the neural network model performs weighting according to a first weight proportional to the discrimination difficulty of each training sample to determine the loss value of the neural network model.
11. The apparatus according to claim 10, wherein the apparatus further comprises:
a clustering module, configured to cluster the training samples having the same class label, so as to determine the class center of the training samples of that class;
wherein the discrimination difficulty of each training sample is directly proportional to the distance of the training sample from the corresponding class center.
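One way this clustering-based difficulty measure could be realized is sketched below, with the per-class mean used as the simplest clustering of same-labelled samples; the mapping from distance to $k_i$ is one monotone choice among many, since the claims fix only that $0 < k_i < 1$ and that $k_i$ decreases with distance:

```python
import torch

def centers_and_difficulty(features, labels, n_classes):
    """Class centers of same-labelled samples, and per-sample k_i in (0, 1)
    that shrinks with distance from the center (distant = harder)."""
    centers = torch.stack([features[labels == j].mean(dim=0)
                           for j in range(n_classes)])
    dist = (features - centers[labels]).norm(dim=1)   # distance to own class center
    k = 1.0 / (1.0 + dist)                            # inverse relation, bounded in (0, 1)
    return centers, k
```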
12. The apparatus according to claim 11, wherein, when updating the class centers, the neural network model is used to perform weighting according to a second weight inversely proportional to the discrimination difficulty of each training sample, so as to determine the variation of the corresponding class center.
13. The apparatus according to claim 12, wherein the loss function of the neural network model is configured to: when the neural network model calculates the loss value of a training sample, adjust the loss value of the training sample by a third weight that is inversely proportional to the proportion of the training sample's class in the full set of training samples.
14. The apparatus according to claim 13, wherein the loss function of the neural network model is expressed as:
$$L = -\frac{1}{m}\sum_{i=1}^{m}(1-k_i)^{\gamma}\, s_{y_i}^{\lambda}\,\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j}e^{W_{j}^{T}x_i+b_{j}}}$$

wherein $L$ denotes the loss function; $i$ denotes the index of a training sample, and $m$ denotes the total number of training samples; $y_i$ denotes the class identifier input to the loss function in the neural network model; $x_i$ denotes a training sample belonging to class $y_i$; $W_j^{T}$ denotes the transpose of the $j$-th column of the weight matrix of the last fully connected layer before the loss function, and $b_j$ denotes the $j$-th column of the bias $b$ of that layer; $W_{y_i}^{T}$ and $b_{y_i}$ denote the $y_i$-th columns of the same weight matrix and bias; the sum in the denominator runs over all classes $j$; $(1-k_i)^{\gamma}$ denotes the first weight of training sample $x_i$, where $k_i$ takes a value greater than 0 and less than 1 and is inversely proportional to the distance between $x_i$ and the center of the class to which $x_i$ belongs; $s_{y_i}^{\lambda}$ denotes the third weight of training sample $x_i$, whose value is inversely proportional to the proportion of class-$y_i$ training samples in the full set of training samples; $T$ denotes transposition; and $\lambda$ and $\gamma$ denote scalars.
15. The apparatus according to claim 10, wherein the weighting according to a second weight inversely proportional to the discrimination difficulty of each training sample, to determine the variation of the corresponding class center, comprises:
determining the variation of the class center $c_j$ according to the formula

$$\Delta c_j = \frac{\sum_{i=1}^{m} q_i\,\delta(y_i = j)\,(c_j - x_i)}{1 + \sum_{i=1}^{m}\delta(y_i = j)}$$

the class center then being updated as $c_j \leftarrow c_j - \alpha_c\,\Delta c_j$; wherein $i$ denotes the index of a training sample, and $m$ denotes the total number of training samples; $j$ and $y_i$ denote class identifiers input to the loss function in the neural network model; $x_i$ denotes a training sample belonging to class $y_i$; $q_i$ denotes the second weight and is inversely proportional to the distance between the training sample $x_i$ and the class center $c_j$; $\delta(\cdot)$ is the Dirac function, equal to 1 when the condition in the brackets is satisfied and 0 otherwise; and $\alpha_c$ is a scalar controlling the learning rate of the class centers, with a value range of $[0,1]$.
16. The apparatus according to any one of claims 10 to 15, wherein the training sample comprises any one of the following: an image, text, or speech.
17. The apparatus according to claim 16, wherein the apparatus further comprises:
a data acquisition module, configured to acquire, by a data acquisition device, object data of an object to be classified that matches the training samples;
a feature acquisition module, configured to acquire classification features of the object data;
a model invocation module, configured to input the classification features into the trained neural network model and obtain an output result of the neural network model;
an execution module, configured to perform a preset operation according to the output result.
18. The apparatus according to claim 17, wherein performing the preset operation according to the output result comprises any one or more of the following:
displaying a classification result of the object to be classified corresponding to the output result;
outputting an access control signal according to the output result;
performing a work order dispatching operation according to the output result.
19. An electronic device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the deep neural network training method according to any one of claims 1 to 9.
20. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the deep neural network training method according to any one of claims 1 to 9.
CN201811528375.XA 2018-12-13 2018-12-13 Deep neural network training method and device, electronic equipment and storage medium Active CN109816092B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811528375.XA CN109816092B (en) 2018-12-13 2018-12-13 Deep neural network training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109816092A true CN109816092A (en) 2019-05-28
CN109816092B CN109816092B (en) 2020-06-05

Family

ID=66602960

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811528375.XA Active CN109816092B (en) 2018-12-13 2018-12-13 Deep neural network training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109816092B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180293678A1 (en) * 2017-04-07 2018-10-11 Nicole Ann Shanahan Method and apparatus for the semi-autonomous management, analysis and distribution of intellectual property assets between various entities
CN107832700A * 2017-11-03 2018-03-23 全悉科技(北京)有限公司 Face recognition method and system
CN108256450A * 2018-01-04 2018-07-06 天津大学 Supervised learning method for face recognition and face verification based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANDONG WEN ET AL.: "A Discriminative Feature Learning Approach for Deep Face Recognition", Computer Vision - ECCV 2016 *

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309856A (en) * 2019-05-30 2019-10-08 华为技术有限公司 Image classification method, the training method of neural network and device
CN110427466A (en) * 2019-06-12 2019-11-08 阿里巴巴集团控股有限公司 Training method and device for the matched neural network model of question and answer
CN112085041A (en) * 2019-06-12 2020-12-15 北京地平线机器人技术研发有限公司 Training method and training device for neural network and electronic equipment
CN110427466B (en) * 2019-06-12 2023-05-26 创新先进技术有限公司 Training method and device for neural network model for question-answer matching
CN110288085A * 2019-06-20 2019-09-27 厦门市美亚柏科信息股份有限公司 Data processing method, device, system and storage medium
CN110288085B (en) * 2019-06-20 2022-06-03 厦门市美亚柏科信息股份有限公司 Data processing method, device and system and storage medium
CN110490054A * 2019-07-08 2019-11-22 北京三快在线科技有限公司 Target area detection method and device, electronic equipment and readable storage medium
CN110490054B (en) * 2019-07-08 2021-03-09 北京三快在线科技有限公司 Target area detection method and device, electronic equipment and readable storage medium
CN110428052A * 2019-08-01 2019-11-08 江苏满运软件科技有限公司 Deep neural network model construction method and device, medium and electronic device
CN110481561A (en) * 2019-08-06 2019-11-22 北京三快在线科技有限公司 Automatic driving vehicle automatic control signal generation method and device
WO2021036397A1 (en) * 2019-08-30 2021-03-04 华为技术有限公司 Method and apparatus for generating target neural network model
CN110532562A (en) * 2019-08-30 2019-12-03 联想(北京)有限公司 Neural network training method, Chinese idiom misuse detection method, device and electronic equipment
CN112561050B (en) * 2019-09-25 2023-09-05 杭州海康威视数字技术股份有限公司 Neural network model training method and device
CN112561050A (en) * 2019-09-25 2021-03-26 杭州海康威视数字技术股份有限公司 Neural network model training method and device
CN110929785A (en) * 2019-11-21 2020-03-27 中国科学院深圳先进技术研究院 Data classification method and device, terminal equipment and readable storage medium
CN110929785B (en) * 2019-11-21 2023-12-05 中国科学院深圳先进技术研究院 Data classification method, device, terminal equipment and readable storage medium
WO2021098618A1 (en) * 2019-11-21 2021-05-27 中国科学院深圳先进技术研究院 Data classification method and apparatus, terminal device and readable storage medium
CN111191769B (en) * 2019-12-25 2024-03-05 中国科学院苏州纳米技术与纳米仿生研究所 Self-adaptive neural network training and reasoning device
CN111191769A (en) * 2019-12-25 2020-05-22 中国科学院苏州纳米技术与纳米仿生研究所 Self-adaptive neural network training and reasoning device
CN111310814A (en) * 2020-02-07 2020-06-19 支付宝(杭州)信息技术有限公司 Method and device for training business prediction model by utilizing unbalanced positive and negative samples
CN111429414A (en) * 2020-03-18 2020-07-17 腾讯科技(深圳)有限公司 Artificial intelligence-based focus image sample determination method and related device
CN111429414B (en) * 2020-03-18 2023-04-07 腾讯科技(深圳)有限公司 Artificial intelligence-based focus image sample determination method and related device
CN111475618A (en) * 2020-03-31 2020-07-31 百度在线网络技术(北京)有限公司 Method and apparatus for generating information
CN111488985A (en) * 2020-04-08 2020-08-04 华南理工大学 Deep neural network model compression training method, device, equipment and medium
CN111488985B (en) * 2020-04-08 2023-11-14 华南理工大学 Deep neural network model compression training method, device, equipment and medium
CN111507396A (en) * 2020-04-15 2020-08-07 广州大学 Method and device for relieving error classification of neural network on unknown samples
CN111507396B (en) * 2020-04-15 2023-08-08 广州大学 Method and device for relieving error classification of unknown class samples by neural network
WO2021238586A1 (en) * 2020-05-27 2021-12-02 华为技术有限公司 Training method and apparatus, device, and computer readable storage medium
CN111680631B (en) * 2020-06-09 2023-12-22 广州视源电子科技股份有限公司 Model training method and device
CN111680631A (en) * 2020-06-09 2020-09-18 广州视源电子科技股份有限公司 Model training method and device
CN111783996B (en) * 2020-06-18 2023-08-25 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment
CN111783996A (en) * 2020-06-18 2020-10-16 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment
CN111861909A (en) * 2020-06-29 2020-10-30 南京理工大学 Network fine-grained image denoising and classifying method
CN111861909B (en) * 2020-06-29 2023-06-16 南京理工大学 Network fine granularity image classification method
CN112381161A (en) * 2020-11-18 2021-02-19 厦门市美亚柏科信息股份有限公司 Neural network training method
CN112381161B (en) * 2020-11-18 2022-08-30 厦门市美亚柏科信息股份有限公司 Neural network training method
CN112398862A (en) * 2020-11-18 2021-02-23 深圳供电局有限公司 Charging pile attack clustering detection method based on GRU model
CN112836816A (en) * 2021-02-04 2021-05-25 南京大学 Training method suitable for crosstalk of photoelectric storage and calculation integrated processing unit
CN112836816B (en) * 2021-02-04 2024-02-09 南京大学 Training method suitable for crosstalk of photoelectric storage and calculation integrated processing unit
WO2022174805A1 (en) * 2021-02-22 2022-08-25 上海商汤智能科技有限公司 Model training method and apparatus, image processing method and apparatus, electronic device and storage medium
CN113242547B (en) * 2021-04-02 2022-10-04 浙江大学 Method and system for filtering user behavior privacy in wireless signal based on deep learning and wireless signal receiving and transmitting device
CN113242547A (en) * 2021-04-02 2021-08-10 浙江大学 Method and system for filtering user behavior privacy in wireless signal based on deep learning and wireless signal receiving and transmitting device
CN114118413A (en) * 2021-11-30 2022-03-01 上海商汤临港智能科技有限公司 Network training and equipment control method, device, equipment and storage medium
CN114241260B (en) * 2021-12-14 2023-04-07 四川大学 Open set target detection and identification method based on deep neural network
CN114241260A (en) * 2021-12-14 2022-03-25 四川大学 Open set target detection and identification method based on deep neural network
CN116660389A (en) * 2023-07-21 2023-08-29 山东大禹水务建设集团有限公司 River sediment detection and repair system based on artificial intelligence
CN116660389B (en) * 2023-07-21 2023-10-13 山东大禹水务建设集团有限公司 River sediment detection and repair system based on artificial intelligence

Also Published As

Publication number Publication date
CN109816092B (en) 2020-06-05

Similar Documents

Publication Publication Date Title
CN109816092A (en) Deep neural network training method, device, electronic equipment and storage medium
CN108536681B (en) Intelligent question-answering method, device, equipment and storage medium based on emotion analysis
CN109817246B (en) Emotion recognition model training method, emotion recognition device, emotion recognition equipment and storage medium
WO2021155706A1 (en) Method and device for training business prediction model by using unbalanced positive and negative samples
Li et al. Identifying disaster damage images using a domain adaptation approach
CN109829358A Micro-expression loan control method, device, computer equipment and storage medium
CN106295591A Gender identification method and device based on facial image
CN108806792A (en) Deep learning facial diagnosis system
Zakariah et al. Sign language recognition for Arabic alphabets using transfer learning technique
CN111831826B (en) Training method, classification method and device of cross-domain text classification model
JP2022141931A (en) Method and device for training living body detection model, method and apparatus for living body detection, electronic apparatus, storage medium, and computer program
CN111444873A (en) Method and device for detecting authenticity of person in video, electronic device and storage medium
CN112767386B (en) Image aesthetic quality evaluation method and system based on theme feature and score distribution
CN115620384B (en) Model training method, fundus image prediction method and fundus image prediction device
CN111144566A (en) Neural network weight parameter training method, characteristic classification method and corresponding device
CN112418059A (en) Emotion recognition method and device, computer equipment and storage medium
CN111694954A (en) Image classification method and device and electronic equipment
Sakthimohan et al. Detection and Recognition of Face Using Deep Learning
CN109192226A Signal processing method and device
CN109101984A Image recognition method and device based on convolutional neural networks
CN111368524A (en) Microblog viewpoint sentence recognition method based on self-attention bidirectional GRU and SVM
EP4198906A1 (en) Image generation based on ethical viewpoints
CN110738985A (en) Cross-modal biometric feature recognition method and system based on voice signals
CN115731620A (en) Method for detecting counter attack and method for training counter attack detection model
CN109657710A (en) Data screening method, apparatus, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant