CN110490306A - Neural network training and object recognition method, device and electronic equipment - Google Patents
Neural network training and object recognition method, device and electronic equipment
- Publication number
- CN110490306A CN110490306A CN201910778270.8A CN201910778270A CN110490306A CN 110490306 A CN110490306 A CN 110490306A CN 201910778270 A CN201910778270 A CN 201910778270A CN 110490306 A CN110490306 A CN 110490306A
- Authority
- CN
- China
- Prior art keywords
- classification
- sample
- neural network
- distribution parameter
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a neural network training and object recognition method, device, and electronic equipment. The neural network training method includes: inputting sample data into a deep neural network and extracting the feature vectors of the samples; obtaining the number of sample classes, assuming that the feature vectors of the samples of each class obey an independent Gaussian distribution, and setting the distribution parameters corresponding to each class; calculating the value of a total loss function based on the feature vectors and the distribution parameters; and adjusting the parameters of the deep neural network and the distribution parameters based on the value of the total loss function, until the value of the total loss function converges. In this way, a Gaussian assumption is made about the distribution of same-class features, so that the association between features and classes can be modeled from a statistical perspective; the deep neural network learns the mean and variance of each class during the training stage, improving the accuracy of the final training effect and of classification.
Description
Technical field
The present invention relates to the field of neural network technology, and in particular to a neural network training and object recognition method, device, and electronic equipment.
Background art
Image classification, which refers to assigning a given image to a class, is one of the major problems in the field of computer vision. In recent years, image classification algorithms based on deep neural networks have greatly improved recognition accuracy. In these algorithms, a well-designed loss function can guide and supervise the deep neural network toward a more accurate classification model, thereby achieving better classification results.

However, existing deep neural networks rely solely on the choice of loss function to guide and supervise training toward a more accurate classification model. Neither the design of the loss function nor the training of the deep neural network takes the distribution characteristics of the samples of each class into account. As a result, classification accuracy is limited and falls short of the ideal.

Therefore, how to design a more reasonable loss function and train the deep neural network starting from the sample distribution, so as to supervise the deep neural network into classifying more accurately, is an urgent problem to be solved.
Summary of the invention
The problem solved by the invention is how to combine the distribution characteristics of the training samples with supervision of the deep neural network so that it classifies more accurately.
To solve the above problem, the present invention first provides a neural network training method, comprising:

inputting sample data into a deep neural network and extracting the feature vectors of the samples;

obtaining the number of sample classes, assuming that the feature vectors of the samples of each class obey an independent Gaussian distribution, and setting the distribution parameters corresponding to each class;

calculating the value of a total loss function based on the feature vectors and the distribution parameters;

adjusting the parameters of the deep neural network and the distribution parameters based on the value of the total loss function, until the value of the total loss function converges.
In this way, a Gaussian assumption is made about the distribution of same-class features, so that the association between features and classes can be modeled from a statistical perspective. The deep neural network learns the mean μ and variance σ of each class during the training stage, so that the Gaussian distribution accurately reflects the association between features and each class, improving the accuracy of the final training effect and of classification.
Optionally, when calculating the value of the total loss function based on the feature vectors and the distribution parameters, an inter-class interval and an intra-class interval are also set, and the value of the total loss function is calculated based on the inter-class interval, the intra-class interval, the feature vectors, and the distribution parameters.

In this way, the inter-class and intra-class intervals constrain the within-class and between-class scatter that influence the total loss function, binding samples of the same class together and pushing different classes apart, so that within-class scatter is small and between-class scatter is large, yielding a more accurate training result.
Optionally, setting the inter-class interval and the intra-class interval, and calculating the value of the total loss function based on the inter-class interval, the intra-class interval, the feature vectors, and the distribution parameters, comprises:

setting the inter-class interval and the intra-class interval;

computing a discrimination distance matrix based on the feature vectors, the distribution parameters, and the intra-class interval;

calculating a classification loss based on the discrimination distance matrix and the inter-class interval;

calculating a likelihood loss based on the true classes of the samples and the distribution parameters;

calculating the value of the total loss function based on the classification loss and the likelihood loss.
In this way, the value of the total loss function can be determined from the feature vectors, the true classes, and the distribution parameters. Calculating the total loss from the distribution parameters establishes, from a statistical perspective, a correspondence between the distribution parameters and the loss function; this correspondence is more accurate, so the total loss function better reflects actual classification accuracy, and a better training result can be achieved.
Optionally, the inter-class interval α > 0 and the intra-class interval β > 0.

In this way, the inter-class interval targets the gap between two classes whose predicted probabilities are close, so the loss on the cases most easily confused becomes larger, improving the training result; the partitioning imposed by the intra-class interval reduces the differences between sample features of the same class.
Optionally, the discrimination distance matrix is composed of discrimination distances d_k. The calculation formula of the discrimination distance d_k is:

where K is the number of classes, k is the class index, i is the sample index, d_k is the discrimination distance between the true class of the sample feature and class k, f_i is the i-th sample feature, C(f_i) is the true class of the i-th sample feature, σ²_{C(f_i)} is the variance of the true class of the i-th sample feature, μ_k is the mean of the k-th class, and β is the intra-class interval, a hyperparameter.
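In the original publication this formula is an image and does not survive in the text. A plausible reconstruction from the variable definitions above, assuming the per-class isotropic Gaussian model used throughout (an inference, not the verbatim patent formula), is:

```latex
d_k = \frac{\lVert f_i - \mu_k \rVert^2}{2\,\sigma_{C(f_i)}^2}
      + \beta\,\mathbb{1}\!\left[k = C(f_i)\right]
```

Under this reading, β inflates the true class's own distance, forcing same-class features to sit strictly closer to their center before the loss can fall.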
Optionally, the calculation formula of the classification loss is:

where L_cls is the classification loss, N is the number of samples, i is the sample index, K is the number of classes, k is the class index, f_i is the i-th sample feature, C(f_i) is the true class of the i-th sample feature, d_k is the discrimination distance between the true class of the sample feature and the k-th class, and α is the inter-class interval, a hyperparameter.
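The classification-loss formula is also an image in the original. Consistent with the variable list above, a plausible reconstruction (an assumption, not the verbatim formula) is a softmax cross-entropy over negative discrimination distances, with the inter-class interval α applied to the true class:

```latex
L_{cls} = -\frac{1}{N}\sum_{i=1}^{N}\log
  \frac{e^{-d_{C(f_i)} - \alpha}}
       {\sum_{k=1}^{K} e^{-d_k - \alpha\,\mathbb{1}\left[k = C(f_i)\right]}}
```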
Optionally, the calculation formula of the likelihood loss is:

where L_lkd is the likelihood loss, N is the number of samples, i is the sample index, f_i is the i-th sample feature, C(f_i) is the true class of the i-th sample feature, μ_{C(f_i)} is the mean of the true class of the i-th sample feature, and σ²_{C(f_i)} is the variance of the true class of the i-th sample feature.
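The likelihood-loss formula image is likewise missing; given the description of this loss as matching the learned parameters to the sample statistics, the natural reading is the Gaussian negative log-likelihood of each sample under its true class (a hedged reconstruction; constant and normalization terms may differ from the patent's formula):

```latex
L_{lkd} = \frac{1}{N}\sum_{i=1}^{N}\left(
  \frac{\lVert f_i - \mu_{C(f_i)} \rVert^2}{2\,\sigma_{C(f_i)}^2}
  + \frac{1}{2}\log \sigma_{C(f_i)}^2 \right)
```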
In this way, the above calculation formulas establish the correspondence between the distribution parameters and the loss function, making that correspondence more accurate, so that the total loss function better reflects actual classification accuracy; a better training result can thus be achieved.
Next, an object recognition method is provided, comprising:

inputting sample data into a deep neural network, extracting the feature vectors of the samples, and obtaining the distribution parameters corresponding to each class, the deep neural network and the distribution parameters having been obtained by training with the neural network training method described above;

calculating prediction class data based on the feature vectors and the distribution parameters corresponding to each class;

judging the class to which the sample data belongs based on the prediction class data.
In this way, after features are extracted from a sample, the class of the sample is judged from the extracted features and the distribution parameters; and because the deep neural network used for feature extraction and the distribution parameters were obtained by training with the above neural network training method, the judgment of the sample's class is more accurate.
Optionally, the prediction class data is a prediction distance matrix composed of the prediction distances D_k from the sample to each class. The calculation formula of the prediction distance D_k is:

where K is the number of classes, k is the class index, i is the sample index, D_k is the prediction distance between the sample and the k-th class, f_i is the feature vector of the i-th sample, σ²_k is the variance of the k-th class, μ_k is the mean of the k-th class, and β is the intra-class interval, a hyperparameter.
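This formula is again an image in the original. Since no true class is available at inference time, a plausible reconstruction (an assumption) scales the distance to each class by that class's own variance:

```latex
D_k = \frac{\lVert f_i - \mu_k \rVert^2}{2\,\sigma_k^2}
```

If the listed β enters as a constant offset on every class, it shifts all D_k equally and does not change which class attains the minimum.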
In this way, the prediction distance matrix can be computed directly from the feature vectors and the distribution parameters, so that it reflects the distance from the sample feature to the center of each class; the class whose center is nearest to the sample feature is the class to which the sample belongs.
Optionally, in the prediction distance matrix, the class with the smallest prediction distance from the sample data is the class to which the sample data belongs.
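The arg-min rule above can be sketched in a few lines; predict_class is a hypothetical helper written for illustration, with isotropic per-class variances assumed:

```python
import numpy as np

def predict_class(f, mus, sigma2s):
    """Assign a feature vector to the class with the smallest prediction
    distance D_k, i.e. the nearest Gaussian center under a variance-scaled
    squared distance.
    f       : (D,) feature vector of one sample
    mus     : (K, D) per-class mean vectors
    sigma2s : (K,) per-class isotropic variances
    """
    d = np.sum((f - mus) ** 2, axis=1) / (2.0 * sigma2s)
    return int(np.argmin(d))
```

Because the distance is scaled by each class's variance, a high-variance class tolerates features farther from its center.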
Furthermore, a neural network training device is provided, comprising:

a feature extraction unit, for inputting sample data into a deep neural network and extracting the feature vectors of the samples;

a parameter setting unit, for obtaining the number of sample classes, assuming that the feature vectors of the samples of each class obey an independent Gaussian distribution, and setting the distribution parameters corresponding to each class;

a loss calculation unit, for calculating the value of the total loss function based on the feature vectors and the distribution parameters;

a parameter adjustment unit, for adjusting the parameters of the deep neural network and the distribution parameters based on the value of the total loss function, until the value of the total loss function converges.

In this way, a Gaussian assumption is made about the distribution of same-class features, so that the association between features and classes can be modeled from a statistical perspective. The deep neural network learns the mean μ and variance σ of each class during the training stage, so that the Gaussian distribution accurately reflects the association between features and each class, improving the accuracy of the final training effect and of classification.
Next, an object recognition device is provided, comprising:

a feature output unit, for inputting sample data into a deep neural network and extracting the feature vectors of the samples;

a class calculation unit, for calculating prediction class data based on the feature vectors and the distribution parameters corresponding to each class, the deep neural network and the distribution parameters having been obtained by training with the neural network training method described above;

a class judging unit, for judging the class to which the sample data belongs based on the prediction class data.

In this way, after features are extracted from a sample, the class of the sample is judged from the extracted features and the distribution parameters; and because the deep neural network used for feature extraction and the distribution parameters were obtained by training with the above neural network training method, the judgment of the sample's class is more accurate.
Finally, an electronic device is provided, including a processor and a memory, the memory storing a control program which, when executed by the processor, implements the neural network training method described above, or implements the object recognition method described above.

Additionally, a computer-readable storage medium is provided, storing instructions which, when loaded and executed by a processor, implement the neural network training method described above, or implement the object recognition method described above.
Brief description of the drawings
Fig. 1 is a flowchart of the neural network training method according to an embodiment of the present invention;
Fig. 2 is a flowchart of one embodiment of step 30 of the neural network training method according to an embodiment of the present invention;
Fig. 3 is a visualization of the discrimination distance matrix according to an embodiment of the present invention;
Fig. 4 is a flowchart of another embodiment of step 30 of the neural network training method according to an embodiment of the present invention;
Fig. 5 is a distribution map of sample features obtained by training with an existing loss function;
Fig. 6 is a distribution map of sample features obtained by training with the loss function of the present invention;
Fig. 7 is a flowchart of the object recognition method according to an embodiment of the present invention;
Fig. 8 is a structural block diagram of the neural network training device according to an embodiment of the present invention;
Fig. 9 is a structural block diagram of the object recognition device according to an embodiment of the present invention;
Fig. 10 is a structural block diagram of an electronic device according to an embodiment of the present invention;
Fig. 11 is a block diagram of another electronic device according to an embodiment of the present invention.
Description of reference symbols:
1 - feature extraction unit, 2 - parameter setting unit, 3 - loss calculation unit, 4 - parameter adjustment unit, 5 - feature output unit, 6 - class calculation unit, 7 - class judging unit, 12 - electronic device, 14 - external device, 16 - processing unit, 18 - bus, 20 - network adapter, 22 - input/output (I/O) interface, 24 - display, 28 - system memory, 30 - random access memory, 32 - cache memory, 34 - storage system, 40 - utility, 42 - program module.
Detailed description of the embodiments
To make the above purposes, features, and advantages of the invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative work fall within the scope of protection of the invention.
For ease of understanding, the technical problems and technical principles involved are described in detail below.

Classifying images with a deep neural network generally divides into a deep neural network training part and a deep neural network judgment part. In the training part, a general loss function is first set according to the images; a large number of image samples is then obtained, the parameters of the deep neural network are set randomly, the image samples are input into the deep neural network for training, and the total loss function is calculated. After the value of the total loss function is obtained, the parameters of the deep neural network are adjusted according to that value; after adjustment, the image samples are again input into the deep neural network for training, and this cycle repeats until the value of the total loss function converges, completing the training of the deep neural network. The judgment part can then be carried out, that is, classifying images with the deep neural network: an image sample is input into the trained deep neural network, and its class is judged from the output.

From the above classification process, it can be seen that the most critical issues are the design of the loss function and the training framework of the deep neural network. These two profoundly affect the final training effect of the deep neural network and the accuracy of classification.
In an existing deep neural network, a sample is input into the network, which outputs a distance parameter corresponding to each class; the class with the largest distance parameter is the recognized class of the sample. Alternatively, the last layer of the deep neural network is a fully connected layer that directly outputs a value for each class, and the class with the largest value is the recognized class of the sample. The mapping between these output distance parameters or values and the specific classes is not intuitive and is poorly interpretable, so it is difficult to embody the specific class mapping accurately. This design of the loss function and training framework does not consider the distribution characteristics of the image samples themselves, and improves the accuracy of the final training effect and classification only by optimizing the loss function; as a result, the improvement of accuracy has reached a bottleneck, and the final classification accuracy struggles to meet growing demand.
An embodiment of the present disclosure provides a neural network training method. The method can be executed by a neural network training device, which can be integrated into electronic equipment such as a computer or a server. Fig. 1 is a flowchart of the neural network training method according to an embodiment of the present invention. The neural network training method comprises:

Step 10: input sample data into a deep neural network and extract the feature vectors of the samples.

The parameters of the deep neural network can be set randomly or set based on experience, thereby reducing the required training time or the number of samples participating in training.
Step 20: obtain the number of sample classes, assume that the feature vectors of the samples of each class obey an independent Gaussian distribution, and set the distribution parameters corresponding to each class.

The number of sample classes is already determined (known) when the data set of sample data is obtained, so it can be read directly. The classes of the samples can also be determined according to the actual situation and adjusted as needed.

In the present application, the feature vectors of the samples of each class are assumed to obey an independent Gaussian distribution; for features obeying an independent Gaussian distribution, the distribution parameters that need to be set are the mean μ and the variance σ.

In this step, the distribution parameters corresponding to each class can be set randomly or based on experience, thereby reducing the required training time or the number of samples participating in training.

For example, it may be decided before training starts that the samples are divided into 30 classes; each of these 30 classes then has its own distribution parameters, namely its own mean μ and variance σ. Setting the distribution parameters corresponding to each class means setting the 30 means μ and variances σ of the 30 classes.
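The 30-class example can be written out directly; the feature dimension and the random initialization scheme below are illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
K, D = 30, 128                # 30 classes, 128-dimensional features (assumed)

# One independent Gaussian per class: a mean vector and an isotropic variance.
mus = rng.normal(0.0, 1.0, size=(K, D))   # randomly initialized means
sigma2s = np.ones(K)                      # unit initial variances

# Both arrays are distribution parameters, adjusted during training
# together with the network weights (step 40).
```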
Step 30: calculate the value of the total loss function based on the feature vectors and the distribution parameters.

The loss function is defined per sample: each sample has a sample loss, and the value of the total loss function is the average of all sample losses. The sample loss of a sample is obtained by feeding the feature vector and the distribution parameters into the loss function.
Step 40: adjust the parameters of the deep neural network and the distribution parameters based on the value of the total loss function, until the value of the total loss function converges.

When adjusting the parameters of the deep neural network and the distribution parameters based on the value of the total loss function, the updates can be iterated over small batches of samples. For example, with 10,000 sample data, 100 samples are selected each time and input into the deep neural network; the value of the total loss function over these 100 samples is calculated, and the parameters of the deep neural network and the distribution parameters of each class are adjusted according to that value. Then another 100 samples are selected and input into the adjusted deep neural network, iterating in this way until the value of the total loss function converges.
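The mini-batch cycle in this example can be illustrated with a toy sketch that updates only the class means on synthetic features, a plain averaging step standing in for full backpropagation (the dataset, sizes, and learning rate are all made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, K = 10_000, 8, 3
labels = rng.integers(0, K, size=N)
true_mus = np.array([[-4.0] * D, [0.0] * D, [4.0] * D])
feats = true_mus[labels] + rng.normal(size=(N, D))   # stand-in for network features

mus = rng.normal(size=(K, D))   # distribution parameters to be learned
lr, batch = 0.05, 100
for step in range(500):
    idx = rng.integers(0, N, size=batch)    # draw a small random mini-batch
    f, y = feats[idx], labels[idx]
    for k in range(K):
        mask = y == k
        if mask.any():
            # gradient step on the squared-distance (likelihood-style) loss w.r.t. mu_k
            mus[k] -= lr * (mus[k] - f[mask].mean(axis=0))
```

After a few hundred batches the learned means settle near the true class centers, which is the convergence condition of step 40 in miniature.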
In this way, a Gaussian assumption is made about the distribution of same-class features, so that the association between features and classes can be modeled from a statistical perspective. The deep neural network learns the mean μ and variance σ of each class during the training stage, so that the Gaussian distribution accurately reflects the association between features and each class, improving the accuracy of the final training effect and of classification.
Optionally, in step 30, an inter-class interval and an intra-class interval are also set, and the value of the total loss function is calculated based on the inter-class interval, the intra-class interval, the feature vectors, and the distribution parameters.

The intra-class interval and the inter-class interval are hyperparameters. Calculating the value of the total loss function based on these intervals constrains the within-class and between-class scatter that influence the total loss function, binding samples of the same class together and pushing classes apart, so that within-class scatter is small and between-class scatter is large, yielding a more accurate training result.
Optionally, as shown in Fig. 2 and Fig. 4, step 30 comprises:

Step 31: set the inter-class interval and the intra-class interval.

The intra-class interval binds and partitions samples according to whether they belong to the same class; the purpose of setting it is to make the differences between features of the same class as small as possible. The inter-class interval characterizes the gap between classes.
Step 32: compute the discrimination distance matrix based on the feature vectors, the distribution parameters, and the intra-class interval.

It should be noted that the discrimination distance matrix is composed of the discrimination distances from the sample to each class; it is a 1 × K matrix, composed of the discrimination distances from the sample to the K classes.
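The 1 × K shape can be seen concretely in a small sketch; the numbers are made up, and the distance form is the isotropic-Gaussian reading assumed throughout this rewrite rather than the verbatim patent formula:

```python
import numpy as np

K = 4
f = np.array([1.0, 2.0])                   # one sample feature (D = 2)
mus = np.array([[0.0, 0.0], [1.0, 2.0],
                [5.0, 5.0], [-3.0, 1.0]])  # K class means
sigma2_true = 1.0                          # variance of the sample's true class
beta, y = 0.2, 1                           # intra-class interval, true class index

d = np.sum((f - mus) ** 2, axis=1) / (2.0 * sigma2_true)
d[y] += beta                               # margin on the true class only
d = d.reshape(1, K)                        # the 1 x K discrimination distance matrix
```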
Step 33: calculate the classification loss based on the discrimination distance matrix and the inter-class interval.

Calculating the classification loss indirectly from the distribution parameters establishes an accurate correspondence between the distribution parameters and the classification loss, so that the loss function has the greatest influence when adjusting the distribution parameters, reaching a better training result.
Step 34: calculate the likelihood loss based on the true classes of the samples and the distribution parameters.

The likelihood loss is closely related to statistical expectation. We want the parameters learned by the network to be, in the statistical sense, as close as possible to the true mean and variance of the mini-batch samples, so a likelihood loss term is used as a constraint. If the losses on the mean and variance were defined in isolation from the loss function, they would have no statistical significance, making the correspondence between the loss function and the features of the samples themselves inaccurate. By setting the likelihood loss, an accurate correspondence between the distribution parameters and the loss function is established from a statistical perspective, reaching a higher training effect.

The true classes of the samples are known when the neural network is trained, so the classification loss can be calculated from the true and inferred classes, and the value of the total loss function can then be determined.
Step 35: calculate the value of the total loss function based on the classification loss and the likelihood loss.

The loss function is defined per sample: each sample has a sample loss, and the value of the total loss function is the average of all sample losses. In this step, the sample loss of each sample is calculated from its classification loss and likelihood loss, and the value of the total loss function is then calculated from the sample losses of all samples.
Through steps 32 to 34, the value of the total loss function can be determined from the feature vectors, the true classes, and the distribution parameters. Calculating the total loss from the distribution parameters establishes, from a statistical perspective, a correspondence between the distribution parameters and the loss function; this correspondence is more accurate, so the total loss function better reflects actual classification accuracy, and a better training result can be achieved.
Fig. 3 shows a visualization of the discrimination distance matrix. In the visualization, the abscissa is the class after training, the ordinate represents the true class of the sample, and each box represents the distance from the true class to the class after training; the darker the color, the smaller the distance. The smallest distances lying along the diagonal indicate that the deep neural network has learned the center corresponding to each class, reaching a very good training result.
Optionally, in step 31, the inter-class interval α > 0 and the intra-class interval β > 0.

When classifying a sample, the probability that the sample belongs to each class is calculated by the deep neural network and the other parts. Among these probabilities, there will always be cases where the probabilities of the sample belonging to two classes are very close; the inter-class interval α > 0 targets the gap between two such classes, making the loss on the parts most easily confused larger. For example, when recognizing digits in images, if for some digit image sample the probability of belonging to class 0 and the probability of belonging to class 8 are very close, then the inter-class interval α > 0 enlarges the loss on this easily confused part. Thus, when the deep neural network is trained, converging the value of the loss function requires modifying the parameters of the deep neural network to reduce this loss as far as possible, that is, modifying the parameters so that the probabilities of the digit image sample belonging to class 0 and to class 8 are no longer close; in other words, classes 0 and 8 are learned further apart.
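The effect of the interval on two nearly tied classes can be seen in a small numerical sketch (the distances and the α value are made-up illustrations, and margin_softmax_loss is a hypothetical helper, not a function from the patent):

```python
import numpy as np

def margin_softmax_loss(d, true_k, alpha):
    """Cross-entropy over negative distances; the margin alpha makes the
    true class's distance look worse, enlarging the loss when a rival
    class is close."""
    z = -np.asarray(d, dtype=float)
    z[true_k] -= alpha              # apply the inter-class interval
    z -= z.max()                    # numerical stabilization
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[true_k])

d = np.array([1.0, 1.1, 5.0])       # class 0 (true) and class 1 nearly tied
loss_plain = margin_softmax_loss(d, true_k=0, alpha=0.0)
loss_margin = margin_softmax_loss(d, true_k=0, alpha=1.0)
```

loss_margin exceeds loss_plain, so minimizing the margined loss pushes classes 0 and 1 further apart than the plain loss would.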
When classifying samples, the differences between features of the same class should be as small as possible. The intra-class interval β > 0 binds and partitions samples according to whether or not they belong to the same class, and this partitioning reduces the differences between sample features of the same class. In the example above, when the digit in an image is 0, even if the digit is written in different styles, the intra-class interval β keeps the final output differences small.
Optionally, the discrimination distance matrix is composed of the discrimination distances dk from the sample to each class, and the discrimination distance dk is calculated as:
where K is the number of classes, k is the class index, i is the sample index, dk is the discrimination distance between the sample and the k-th class, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, σ²C(fi) is the variance of the true class of the i-th sample feature, μk is the mean of the k-th class, and β is the intra-class margin, a hyperparameter.
It should be noted that the discrimination distance matrix is composed of the discrimination distances dk from the sample to each class; it is a 1 × K matrix formed by the discrimination distances from the sample to the K classes.
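Since the formula image itself is not reproduced above, the following sketch shows one plausible reading of the discrimination distance: a variance-scaled squared distance from the feature to each class mean, with the intra-class margin β enlarging the true-class entry. The exact placement of β is an assumption.

```python
import numpy as np

def discrimination_distances(f, mus, sigma2_true, true_k, beta):
    """1 x K discrimination-distance sketch: squared distance to each class
    mean, scaled by the variance of the true class; beta > 0 enlarges the
    true-class distance so same-class features are pulled tighter."""
    d = np.array([np.sum((f - mu) ** 2) / (2.0 * sigma2_true) for mu in mus])
    d[true_k] *= (1.0 + beta)
    return d
```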
Optionally, the classification loss is calculated as:
where Lcls is the classification loss, N is the number of samples, i is the sample index, K is the number of classes, k is the class index, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, dk is the discrimination distance between the sample and the k-th class, and α is the classification margin, a hyperparameter.
In the function R(k = C(fi)), the value is 1 when the class k is the true class and 0 otherwise.
In this way, the classification loss better reflects the true loss, so the training effect is better and the classification is more accurate.
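A hedged sketch of such a margin classification loss (the softmax-over-negative-distances form is an assumption consistent with the description; the indicator R(k = C(fi)) selects the true class):

```python
import numpy as np

def classification_loss(d, true_k, alpha):
    """Per-sample classification loss sketch: softmax cross-entropy over
    negative discrimination distances, where the margin alpha enlarges the
    true-class distance via the indicator R(k == true_k)."""
    d = np.asarray(d, float)
    scale = np.where(np.arange(len(d)) == true_k, 1.0 + alpha, 1.0)
    logits = -scale * d
    logits -= logits.max()             # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[true_k])
```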
Optionally, the likelihood loss is calculated as:
where Llkd is the likelihood loss, N is the number of samples, i is the sample index, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, μC(fi) is the mean of the true class of the i-th sample feature, and σ²C(fi) is the variance of the true class of the i-th sample feature.
It should be noted that in the above formulas the feature f, the mean μ, and the variance σ are multi-dimensional vectors.
In this way, the above formulas establish the correspondence between the distribution parameters and the loss function, so that the correspondence between the distribution parameters and the total loss function is more accurate and the total loss function better reflects the accuracy of the actual classification; a better training effect can therefore be achieved through the total loss function.
In the above formulas, the gradients that update the distribution parameters μ and σ come from two parts, the classification loss and the likelihood loss; the likelihood loss, while preserving the classification accuracy as far as possible, penalizes the difference between the statistical mean and statistical variance of the features and the parameters μ and σ.
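A sketch of the likelihood term under the stated Gaussian assumption (an isotropic variance per class is assumed here for brevity): it is the negative log-likelihood of the feature under its true class's Gaussian, which is what drives μ and σ toward the statistical mean and variance of the class features.

```python
import numpy as np

def likelihood_loss(f, mu_true, sigma2_true):
    """Per-sample likelihood loss sketch: negative log-likelihood of f under
    N(mu_true, sigma2_true * I), dropping the constant term."""
    dim = f.shape[0]
    return (np.sum((f - mu_true) ** 2) / (2.0 * sigma2_true)
            + 0.5 * dim * np.log(sigma2_true))
```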
Figure 5 shows the distribution of sample features obtained by training with an existing loss function, and Figure 6 shows the distribution of sample features obtained by training with the loss function of the present application. It is apparent from them that, compared with the existing loss function, the loss function of the present application yields a smaller feature distribution variance and less aliasing between classes, making high-accuracy judgments easier and thus greatly improving the accuracy of sample classification.
The embodiment of the present disclosure provides an object recognition method, which may be executed by an object recognition apparatus; the object recognition apparatus may be integrated in an electronic device such as a computer or a server. Figure 7 is a flowchart of the object recognition method according to an embodiment of the present invention; the object recognition method comprises:
Step 100: input sample data into a deep neural network and extract the feature vector of the sample; the deep neural network and the distribution parameters are obtained by training with the neural network training method described above.
In this object recognition method, the details of step 100 may refer to the description of step 10 in the neural network training method and are not repeated here.
The parameters of the deep neural network and the distribution parameters are obtained by training with the neural network training method described above. In this way, the deep neural network and the distribution parameters are first trained, and the class of a sample is then judged with the trained deep neural network and distribution parameters, so the sample can be classified accurately.
Step 200: calculate prediction class data based on the feature vector and the distribution parameters of each class.
The prediction class data is data reflecting the class of the sample, for example the probability that the sample belongs to each class, the distance from the sample to each class, and the like.
Step 300: judge the class of the sample data based on the prediction class data.
If the prediction class data directly reflects the class of the sample, such as the probability of the sample belonging to each class, the class with the largest probability (with analogous judgments for other intuitive data) is taken as the class of the sample; if the prediction class data does not directly reflect the class of the sample, the judgment may be made through the indirect correspondence between the data and the class, or after converting the data into directly corresponding data.
In this way, through steps 100-300, after features are extracted from the sample, the class of the sample is judged according to the extracted features and the distribution parameters; and since the deep neural network that performs the feature extraction and the distribution parameters are obtained by training with the above neural network training method, the judgment of the sample class is more accurate.
In addition, the feature distribution within each class is treated as an independent Gaussian distribution, so that the distribution parameters are trained by the neural network training method and the class of a sample is judged with the trained distribution parameters. In this way, when the distribution parameters are trained and the class of a sample is judged, the characteristics of the class feature distribution itself are taken into account, so the judgments of the distribution parameters and the deep neural network are more targeted, and the trained distribution parameters and deep neural network can judge the class of a sample more accurately.
Optionally, in step 200, the prediction class data is a prediction distance matrix. In this way, the prediction distance matrix can be calculated directly from the feature vector and the distribution parameters, and it reflects the distance from the sample feature to the center of each class; the class whose center is closest to the sample feature is the class of the sample.
It should be noted that the prediction distance matrix is composed of the prediction distances Dk from the sample to each class; it is a 1 × K matrix formed by the prediction distances from the sample to the K classes.
Optionally, the prediction distance matrix is composed of the prediction distances Dk from the sample to each class, and the prediction distance Dk is calculated as:
where K is the number of classes, k is the class index, i is the sample index, Dk is the prediction distance between the sample and the k-th class, fi is the feature vector of the i-th sample, σ²k is the variance of the k-th class, μk is the mean of the k-th class, and β is the intra-class margin, a hyperparameter.
Here Dk is the prediction distance between the sample and the k-th class, while dk is the discrimination distance between the sample and the k-th class. The two are very similar; the difference arises from using the same formula for neural network training versus for object recognition. In object recognition, since the true class of the sample is unknown, the variance of the k-th class is substituted for the variance of the true class of the sample.
The prediction distance matrix is composed of the prediction distances Dk from the sample to each class; Dk expresses the distance, predicted by the object recognition method, from the sample to each class. The class with the smallest distance is the class of the sample, i.e., the class predicted by the object recognition method.
For ease of judgment, the negatives of the entries of the prediction distance matrix may be input into a softmax function, and the class corresponding to the maximum softmax output is the class of the sample, i.e., the class predicted by the object recognition method.
The softmax function is common knowledge to those skilled in the art and is not described here.
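The recognition path can be sketched as follows (an illustrative assumption: each class's own variance replaces the unknown true-class variance, and the patent's exact use of β in Dk is not reproduced here):

```python
import numpy as np

def predict_class(f, mus, sigma2s):
    """Prediction sketch: compute a prediction distance to each class using
    that class's own mean and variance, feed the negatives to softmax, and
    take the argmax as the predicted class."""
    D = np.array([np.sum((f - mu) ** 2) / (2.0 * s2)
                  for mu, s2 in zip(mus, sigma2s)])
    logits = -D
    logits -= logits.max()             # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return int(np.argmax(p)), p
```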
It should be noted that in this object recognition method the classification margin α and the intra-class margin β are hyperparameters, with the classification margin α = 0 and the intra-class margin β > 0, and the intra-class margin β is kept consistent with that of the neural network training method described above.
In this way, zeroing the classification margin α reduces the loss caused by the margin, since at prediction time there is no longer any need to guard against confusing different classes; the intra-class margin β constrains and partitions samples according to whether they belong to the same class, consistent with training, so that the same standard is used when the class of a sample is judged, guaranteeing the accuracy of the judgment. If the margins were inconsistent, a sample belonging to one class during training might be assigned to another class during prediction, causing the class of the sample to be judged incorrectly.
Optionally, the prediction class data may be the probabilities that the sample feature belongs to each class. In this way, the probabilities can be calculated directly from the feature vector output by the deep neural network and the distribution parameters, without inputting the result into other functions before judging, which is more direct and faster.
Optionally, the probability that a sample feature belongs to each class is calculated as follows: if the sample feature output by the deep neural network is f, the probability that it belongs to class j can be calculated as:
where K is the number of classes, k and j are class indices, p(j|f) is the probability that the feature f belongs to class j, μj is the mean of the j-th class, σ²j is the variance of the j-th class, μk is the mean of the k-th class, σ²k is the variance of the k-th class, p(j) is the prior probability that the feature belongs to the j-th class, p(k) is the prior probability that the feature belongs to the k-th class, and D is the dimension of the multi-dimensional Gaussian distribution.
Here N(f; μk, σ²k) denotes that f follows a normal distribution with mean μk and variance σ²k.
Here p(j) is the prior probability that the feature belongs to the j-th class and p(k) is the prior probability that the feature belongs to the k-th class; when no prior probability is available, the prior probabilities of all classes may be assumed equal, i.e., p(j) = p(k) = 1/K.
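A sketch of this posterior calculation (isotropic Gaussian class likelihoods and the uniform default prior p(k) = 1/K are assumed):

```python
import numpy as np

def class_posterior(f, mus, sigma2s, priors=None):
    """p(j|f) by Bayes' rule with isotropic Gaussian class likelihoods
    N(f; mu_k, sigma2_k * I); with no prior knowledge the class priors
    default to the uniform 1/K."""
    K = len(mus)
    dim = f.shape[0]
    if priors is None:
        priors = np.full(K, 1.0 / K)
    # log-likelihood of an isotropic D-dimensional Gaussian per class
    loglik = np.array([
        -0.5 * dim * np.log(2.0 * np.pi * s2) - np.sum((f - mu) ** 2) / (2.0 * s2)
        for mu, s2 in zip(mus, sigma2s)
    ])
    logpost = loglik + np.log(priors)
    logpost -= logpost.max()           # numerical stability
    post = np.exp(logpost)
    return post / post.sum()
```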
The embodiment of the present disclosure provides a neural network training apparatus for executing the neural network training method described above; the neural network training apparatus is described in detail below.
Figure 8 is a structural block diagram of the neural network training apparatus according to an embodiment of the present invention; the neural network training apparatus comprises:
a feature extraction unit 1, configured to input sample data into a deep neural network and extract the feature vector of the sample;
a parameter setting unit 2, configured to obtain the number of sample classes, set the feature vectors of the samples of each class to follow independent Gaussian distributions, and set the distribution parameters of each class;
a loss calculation unit 3, configured to calculate the value of the total loss function based on the feature vector and the distribution parameters;
a parameter adjustment unit 4, configured to adjust the parameters of the deep neural network and the distribution parameters based on the value of the total loss function until the value of the total loss function converges.
In this way, a Gaussian assumption is made about the distribution regularity of same-class features, so that the association between features and classes can be modeled from a statistical point of view; the mean μ and variance σ of each class are learned through the deep neural network in the training stage, so that the association between features and classes is accurately reflected by the Gaussian distribution, improving the final training effect and the accuracy of classification.
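Purely as an illustrative skeleton of how units 1-4 interact (all class and method names are assumptions), a toy trainer with an identity "network" can show the adjustment step reducing the likelihood part of the loss:

```python
import numpy as np

class GaussianClassifierTrainer:
    """Toy, runnable illustration of units 1-4 (all names assumed): features
    pass through a fixed 'network' (identity here), each class keeps Gaussian
    parameters (mu, sigma^2), and the adjustment step nudges the parameters
    toward the statistics of the observed class features, which is the
    direction the likelihood loss drives them."""
    def __init__(self, num_classes, feat_dim):
        self.mus = np.zeros((num_classes, feat_dim))   # unit 2: distribution
        self.sigma2s = np.ones(num_classes)            # parameters per class

    def extract(self, x):              # unit 1: feature extraction (identity)
        return np.asarray(x, float)

    def likelihood_loss(self, f, k):   # unit 3 (likelihood part only)
        dim = f.shape[0]
        return (np.sum((f - self.mus[k]) ** 2) / (2.0 * self.sigma2s[k])
                + 0.5 * dim * np.log(self.sigma2s[k]))

    def adjust(self, f, k, lr=0.5):    # unit 4: move mu toward the feature
        self.mus[k] += lr * (f - self.mus[k])
```

In a real implementation, unit 4 would update both the deep network's weights and (μ, σ) by gradient descent on the total loss; the toy `adjust` step only nudges μ toward the observed feature to show the direction of the update.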
Optionally, the loss calculation unit 3 is further configured to set a classification margin and an intra-class margin, and to calculate the value of the total loss function based on the classification margin, the intra-class margin, the feature vector, and the distribution parameters.
In this way, the total loss function is influenced by the intra-class divergence and the between-class divergence constrained by the classification margin and the intra-class margin, so that samples of the same class are bound and partitioned; the intra-class divergence is constrained to be small and the between-class divergence to be large, achieving a more accurate training effect.
Optionally, the loss calculation unit 3 is further configured to: calculate a discrimination distance matrix based on the feature vector and the distribution parameters; calculate a classification loss based on the discrimination distance matrix; calculate a likelihood loss based on the true class of the sample and the distribution parameters; and calculate the value of the total loss function based on the classification loss and the likelihood loss.
In this way, the value of the total loss function can be determined from the feature vector, the true class, and the distribution parameters. Calculating the total loss from the distribution parameters establishes the correspondence between the distribution parameters and the loss function from a statistical point of view, so that the correspondence between the distribution parameters and the total loss function is more accurate and the total loss function better reflects the accuracy of the actual classification; a better training effect can therefore be achieved through the total loss function.
Optionally, the loss calculation unit 3 is further configured to set the classification margin α and the intra-class margin β with α > 0 and β > 0.
In this way, the classification margin widens the gap between two classes whose probabilities are similar, so that the loss of the most easily confused part becomes larger, improving the training effect; the partition imposed by the intra-class margin reduces the differences between the features of samples of the same class.
Optionally, the discrimination distance matrix is composed of the discrimination distances dk, and the discrimination distance dk is calculated as:
where K is the number of classes, k is the class index, i is the sample index, dk is the discrimination distance between the true class of the sample feature and the k-th class, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, σ²C(fi) is the variance of the true class of the i-th sample feature, μk is the mean of the k-th class, and β is the intra-class margin, a hyperparameter.
Optionally, the classification loss is calculated as:
where Lcls is the classification loss, N is the number of samples, i is the sample index, K is the number of classes, k is the class index, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, dk is the discrimination distance between the true class of the sample feature and the k-th class, and α is the classification margin, a hyperparameter.
In the function R(k = C(fi)), the value is 1 when the class k is the true class and 0 otherwise.
Optionally, the likelihood loss is calculated as:
where Llkd is the likelihood loss, N is the number of samples, i is the sample index, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, μC(fi) is the mean of the true class of the i-th sample feature, and σ²C(fi) is the variance of the true class of the i-th sample feature.
It should be noted that in the above formulas the feature f, the mean μ, and the variance σ are multi-dimensional vectors.
In this way, the above formulas establish the correspondence between the distribution parameters and the loss function, so that the correspondence between the distribution parameters and the total loss function is more accurate and the total loss function better reflects the accuracy of the actual classification; a better training effect can therefore be achieved through the total loss function.
The embodiment of the present disclosure provides an object recognition apparatus for executing the object recognition method described above; the object recognition apparatus is described in detail below.
Figure 9 is a structural block diagram of the object recognition apparatus according to an embodiment of the present invention; the object recognition apparatus comprises:
a feature output unit 5, configured to input sample data into a deep neural network and extract the feature vector of the sample;
a class calculation unit 6, configured to calculate prediction class data based on the feature vector and the distribution parameters of each class, the deep neural network and the distribution parameters being obtained by training with the neural network training method described above;
a class judging unit 7, configured to judge the class of the sample data based on the prediction class data.
In this way, after features are extracted from the sample, the class of the sample is judged according to the extracted features and the distribution parameters; and since the deep neural network that performs the feature extraction and the distribution parameters are obtained by training with the above neural network training method, the judgment of the sample class is more accurate.
Optionally, the prediction class data is a discrimination distance matrix. In this way, the discrimination distance matrix can be calculated directly from the feature vector and the distribution parameters, and it reflects the distance from the sample feature to the center of each class; the class whose center is closest to the sample feature is the class of the sample.
It should be noted that the apparatus embodiments described above are merely exemplary. For example, the division into units is only a division by logical function; in actual implementation there may be other division manners, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Further, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The internal functions and structures of the neural network training apparatus and the object recognition apparatus are described above. As shown in Figure 10, in practice, the neural network training apparatus and the object recognition apparatus may be implemented as an electronic device comprising a processor and a memory, the memory storing a control program that, when executed by the processor, implements the neural network training method described above or the object recognition method described above.
Figure 11 is a block diagram of another electronic device according to an embodiment of the present invention. For example, the electronic device 800 may be a computer, a server, a terminal, a digital broadcast terminal, a messaging device, etc.
The electronic device 12 shown in Figure 11 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in Figure 11, the electronic device 12 may be implemented in the form of a general-purpose electronic device. The components of the electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (hereinafter: ISA) bus, the Micro Channel Architecture (hereinafter: MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (hereinafter: VESA) local bus, and the Peripheral Component Interconnection (hereinafter: PCI) bus.
The electronic device 12 typically comprises a variety of computer-system-readable media. These media may be any usable media accessible by the electronic device 12, including volatile and non-volatile media and removable and non-removable media.
The memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (hereinafter: RAM) 30 and/or a cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/non-volatile computer-readable storage media. By way of example only, the storage system 34 may be used to read and write a non-removable, non-volatile magnetic medium (not shown in the figure, commonly referred to as a "hard disk drive"). Although not shown in Figure 11, a disk drive for reading and writing a removable non-volatile magnetic disk (e.g., a "floppy disk") and an optical disc drive for reading and writing a removable non-volatile optical disc (e.g., a compact disc read-only memory (hereinafter: CD-ROM), a digital versatile disc read-only memory (hereinafter: DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (e.g., at least one) program modules, and these program modules are configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28; such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data, and each or some combination of these examples may include an implementation of a network environment. The program modules 42 generally execute the functions and/or methods of the embodiments described herein.
The electronic device 12 may also communicate with one or more external devices 14 (such as a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer system/server 12, and/or with any device (such as a network card, a modem, etc.) that enables the computer system/server 12 to communicate with one or more other electronic devices. Such communication may be carried out through an input/output (I/O) interface 22. Moreover, the electronic device 12 may also communicate with one or more networks (such as a local area network (hereinafter: LAN), a wide area network (hereinafter: WAN), and/or a public network, for example, the Internet) through a network adapter 20. As shown, the network adapter 20 communicates with the other modules of the electronic device 12 through the bus 18. It should be noted that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, etc.
The processing unit 16 executes various functional applications and data processing, such as implementing the methods mentioned in the foregoing embodiments, by running programs stored in the system memory 28.
The electronic device of the present invention may be a server or a terminal device of limited computing power, and the lightweight network structure of the present invention is particularly suitable for the latter. Implementations of the terminal device include, but are not limited to: an intelligent mobile communication terminal, an unmanned aerial vehicle, a robot, a portable image processing device, a security device, etc.
The embodiment of the present disclosure provides a computer-readable storage medium storing instructions that, when loaded and executed by a processor, implement the neural network training method described above or the object recognition method described above.
The technical solutions of the embodiments of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
Although the disclosure is disclosed as above, the protection scope of the disclosure is not limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the disclosure, and these changes and modifications will fall within the protection scope of the invention.
Claims (14)
1. A neural network training method, characterized by comprising:
inputting sample data into a deep neural network, and extracting a feature vector of the sample;
obtaining the number of sample classes, setting the feature vectors of the samples of each class to follow independent Gaussian distributions, and setting distribution parameters of each class;
calculating a value of a total loss function based on the feature vector and the distribution parameters;
adjusting parameters of the deep neural network and the distribution parameters based on the value of the total loss function until the value of the total loss function converges.
2. The neural network training method according to claim 1, characterized in that, when calculating the value of the total loss function based on the feature vector and the distribution parameters, a classification margin and an intra-class margin are also set, and the value of the total loss function is calculated based on the classification margin, the intra-class margin, the feature vector, and the distribution parameters.
3. The neural network training method according to claim 2, characterized in that setting the classification margin and the intra-class margin, and calculating the value of the total loss function based on the classification margin, the intra-class margin, the feature vector, and the distribution parameters, comprises:
setting the classification margin and the intra-class margin;
calculating a discrimination distance matrix based on the feature vector, the distribution parameters, and the intra-class margin;
calculating a classification loss based on the discrimination distance matrix and the classification margin;
calculating a likelihood loss based on a true class of the sample and the distribution parameters;
calculating the value of the total loss function based on the classification loss and the likelihood loss.
4. The neural network training method according to claim 3, characterized in that the classification margin α > 0 and the intra-class margin β > 0.
5. The neural network training method according to claim 3, characterized in that the discrimination distance matrix is composed of discrimination distances dk, and the discrimination distance dk is calculated as:
where K is the number of classes, k is the class index, i is the sample index, dk is the discrimination distance between the sample and the k-th class, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, σ²C(fi) is the variance of the true class of the i-th sample feature, μk is the mean of the k-th class, and β is the intra-class margin, a hyperparameter.
6. The neural network training method according to claim 3, characterized in that the classification loss is calculated as:
where Lcls is the classification loss, N is the number of samples, i is the sample index, K is the number of classes, k is the class index, fi is the i-th sample feature, C(fi) is the true class of the i-th sample feature, dk is the discrimination distance between the true class of the sample feature and the k-th class, and α is the classification margin, a hyperparameter.
7. The neural network training method according to claim 3, wherein the likelihood loss is calculated by the following formula (presented as an image in the original publication):
where L_lkd is the likelihood loss, N is the number of samples, i is the sample index, f_i is the i-th sample feature, C(f_i) is the true class of the i-th sample feature, μ_C(f_i) is the mean of the true class of the i-th sample feature, and σ²_C(f_i) is the variance of the true class of the i-th sample feature.
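The loss formulas in claims 5–7 are published as images and are not reproduced in this text. The sketch below is therefore a hypothetical reconstruction in the spirit of large-margin Gaussian losses, using the symbol definitions given above; the exact form of each formula, the placement of α and β, and all function names are assumptions, not the patent's actual equations.

```python
import numpy as np

def discrimination_distances(features, labels, means, variances, beta=0.5):
    """Assumed d[i, k] (claim 5): squared distance from feature f_i to class
    mean mu_k, scaled by the variance of the sample's true class, with the
    within-class interval beta added on the true class."""
    diff = features[:, None, :] - means[None, :, :]            # (N, K, dim)
    d = (diff ** 2).sum(axis=2) / (2.0 * variances[labels][:, None])
    d[np.arange(len(labels)), labels] += beta                  # within-class interval
    return d

def classification_loss(d, labels, alpha=0.2):
    """Assumed L_cls (claim 6): softmax over negative distances with the
    true-class distance enlarged by (1 + alpha), so the true class must win
    by a relative margin."""
    n = d.shape[0]
    logits = -d.copy()
    logits[np.arange(n), labels] = -(1.0 + alpha) * d[np.arange(n), labels]
    logits -= logits.max(axis=1, keepdims=True)                # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return float(-np.mean(np.log(p[np.arange(n), labels])))

def likelihood_loss(features, labels, means, variances):
    """Assumed L_lkd (claim 7): negative Gaussian log-likelihood of each
    sample under its true class, up to additive constants."""
    diff = features - means[labels]
    var = variances[labels][:, None]
    return float(np.mean(np.sum(diff ** 2 / (2.0 * var) + 0.5 * np.log(var), axis=1)))
```

Under this reading, enlarging the within-class interval (or the true-class distance) raises the classification loss, and a sample sitting exactly on its class mean with unit variance incurs zero likelihood loss.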
8. An object recognition method, comprising:
inputting sample data into a deep neural network, extracting the feature vector of the sample, and obtaining the distribution parameter corresponding to each class, the deep neural network and the distribution parameters having been obtained by training with the neural network training method of any one of claims 1-7;
calculating prediction class data based on the feature vector and the distribution parameter corresponding to each class;
determining the class to which the sample data belongs based on the prediction class data.
9. The object recognition method according to claim 8, wherein the prediction class data is a prediction distance matrix composed of the prediction distances D_k from the sample to each class, and D_k is calculated by the following formula (presented as an image in the original publication):
where K is the number of classes, k is the class index, i is the sample index, D_k is the prediction distance between the sample and the k-th class, f_i is the feature vector of the i-th sample, σ²_k is the variance of the k-th class, μ_k is the mean of the k-th class, and β is the within-class interval, which is a hyperparameter.
10. The object recognition method according to claim 9, wherein, in the prediction distance matrix, the class with the smallest prediction distance from the sample data is the class to which the sample data belongs.
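The prediction formula in claim 9 is likewise published as an image. The sketch below is a hypothetical reading of claims 9-10: a per-class scaled squared distance, followed by an arg-min over classes. The exact form of D_k and the function name are assumptions.

```python
import numpy as np

def predict_class(features, means, variances, beta=0.5):
    """Assumed inference step (claims 9-10): D[i, k] is the squared distance
    from feature f_i to class mean mu_k, scaled by that class's variance,
    plus the within-class interval beta; the predicted class is the one
    with the smallest prediction distance."""
    diff = features[:, None, :] - means[None, :, :]            # (N, K, dim)
    D = (diff ** 2).sum(axis=2) / (2.0 * variances[None, :]) + beta
    return D.argmin(axis=1), D
```

Note that under this reading the additive β shifts every column of D equally, so it does not change the arg-min at inference time.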
11. A neural network training device, comprising:
a feature extraction unit (1) for inputting sample data into a deep neural network and extracting the feature vector of the sample;
a parameter setting unit (2) for obtaining the number of sample classes, assuming that the feature vectors of the samples of each class obey independent Gaussian distributions, and setting the distribution parameter corresponding to each class;
a loss calculation unit (3) for calculating the value of the total loss function based on the feature vector and the distribution parameters;
a parameter adjustment unit (4) for adjusting the parameters of the deep neural network and the distribution parameters based on the value of the total loss function, until the value of the total loss function converges.
12. An object recognition device, comprising:
a feature output unit (5) for inputting sample data into a deep neural network and extracting the feature vector of the sample;
a class calculation unit (6) for calculating prediction class data based on the feature vector and the distribution parameter corresponding to each class, the deep neural network and the distribution parameters having been obtained by training with the neural network training method of any one of claims 1-7;
a class determination unit (7) for determining the class to which the sample data belongs based on the prediction class data.
13. An electronic device comprising a processor and a memory, wherein the memory stores a control program which, when executed by the processor, implements the neural network training method of any one of claims 1-7, or implements the object recognition method of any one of claims 8-10.
14. A computer-readable storage medium storing instructions which, when loaded and executed by a processor, implement the neural network training method of any one of claims 1-7, or implement the object recognition method of any one of claims 8-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910778270.8A CN110490306A (en) | 2019-08-22 | 2019-08-22 | A kind of neural metwork training and object identifying method, device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110490306A true CN110490306A (en) | 2019-11-22 |
Family
ID=68551764
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910778270.8A Pending CN110490306A (en) | 2019-08-22 | 2019-08-22 | A kind of neural metwork training and object identifying method, device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490306A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111523638A (en) * | 2020-03-10 | 2020-08-11 | 中移(杭州)信息技术有限公司 | Method, device, terminal and storage medium for measuring generalization capability of deep neural network |
CN111507396A (en) * | 2020-04-15 | 2020-08-07 | 广州大学 | Method and device for relieving error classification of neural network on unknown samples |
CN111507396B (en) * | 2020-04-15 | 2023-08-08 | 广州大学 | Method and device for relieving error classification of unknown class samples by neural network |
CN111598153A (en) * | 2020-05-13 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Data clustering processing method and device, computer equipment and storage medium |
CN111598153B (en) * | 2020-05-13 | 2023-02-24 | 腾讯科技(深圳)有限公司 | Data clustering processing method and device, computer equipment and storage medium |
WO2023284465A1 (en) * | 2021-07-16 | 2023-01-19 | 腾讯科技(深圳)有限公司 | Image detection method and apparatus, computer-readable storage medium, and computer device |
CN114627315A (en) * | 2022-01-31 | 2022-06-14 | 南通爱米食品有限公司 | Method for identifying baking stage of mass pastry |
CN114627315B (en) * | 2022-01-31 | 2023-04-18 | 南通爱米食品有限公司 | Method for identifying baking stage of mass pastry |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490306A (en) | A kind of neural metwork training and object identifying method, device and electronic equipment | |
CN107909101B (en) | Semi-supervised transfer learning character identifying method and system based on convolutional neural networks | |
US10929649B2 (en) | Multi-pose face feature point detection method based on cascade regression | |
CN107609459B (en) | A kind of face identification method and device based on deep learning | |
CN105893954B (en) | A kind of Non-negative Matrix Factorization face identification method and system based on nuclear machine learning | |
CN109033938A (en) | A kind of face identification method based on ga s safety degree Fusion Features | |
CN106599789A (en) | Video class identification method and device, data processing device and electronic device | |
CN106503727B (en) | A kind of method and device of classification hyperspectral imagery | |
CN111259738A (en) | Face recognition model construction method, face recognition method and related device | |
CN108009567B (en) | Automatic excrement character distinguishing method combining image color and HOG and SVM | |
CN104834941A (en) | Offline handwriting recognition method of sparse autoencoder based on computer input | |
CN108875655A (en) | A kind of real-time target video tracing method and system based on multiple features | |
CN109033978A (en) | A kind of CNN-SVM mixed model gesture identification method based on error correction strategies | |
CN113408505B (en) | Chromosome polarity identification method and system based on deep learning | |
CN109002463A (en) | A kind of Method for text detection based on depth measure model | |
CN109886284A (en) | Fraud detection method and system based on hierarchical clustering | |
WO2023116565A1 (en) | Method for intelligently designing network security architecture diagram | |
CN109214444B (en) | Game anti-addiction determination system and method based on twin neural network and GMM | |
CN111368762A (en) | Robot gesture recognition method based on improved K-means clustering algorithm | |
CN112102928B (en) | Pathological image dyeing style normalization method and device | |
CN109656808A (en) | A kind of Software Defects Predict Methods based on hybrid active learning strategies | |
Böhle et al. | B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers | |
CN102968618A (en) | Static hand gesture recognition method fused with BoF model and spectral clustering algorithm | |
CN107729877A (en) | A kind of method for detecting human face and device based on cascade classifier | |
CN112257787B (en) | Image semi-supervised classification method based on generation type dual-condition confrontation network structure |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20191122 |