CN105678340A - Automatic image marking method based on enhanced stack type automatic encoder - Google Patents

Automatic image marking method based on enhanced stack type automatic encoder

Info

Publication number
CN105678340A
CN105678340A CN201610035975.7A
Authority
CN
China
Prior art keywords
model
theta
training
sigma
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610035975.7A
Other languages
Chinese (zh)
Other versions
CN105678340B (en)
Inventor
柯逍
周铭柯
杜明智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN201610035975.7A priority Critical patent/CN105678340B/en
Publication of CN105678340A publication Critical patent/CN105678340A/en
Application granted granted Critical
Publication of CN105678340B publication Critical patent/CN105678340B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention relates to an automatic image annotation method based on an enhanced stacked auto-encoder. To solve the problem that the traditional SAE model in deep learning cannot effectively train on biased data sets, a balanced stacked auto-encoder that improves the accuracy of low-frequency labels is proposed, improving the annotation performance on low-frequency labels. To solve the problem that a single B-SAE model is unstable, so that its annotation performance changes greatly with its parameters, an enhanced stacked auto-encoder aimed at the image annotation task is proposed: groups of sub-models are trained in sequence, and the optimal B-SAE sub-model of each group is weighted and accumulated, yielding a stable annotation result. The method trains the weights layer by layer and uses the back-propagation algorithm for global fine-tuning, overcoming the weak generalization ability of traditional shallow models and their difficulty in converging to a good optimum; training of weakly labeled samples is strengthened during the training process, improving the annotation performance of the whole model. The method is simple, flexible, and highly practical.

Description

An automatic image annotation method based on an enhanced stacked auto-encoder
Technical field
The present invention relates to the fields of pattern recognition and computer vision, and in particular to an automatic image annotation method based on an enhanced stacked auto-encoder.
Background technology
With the rapid development of multimedia image technology, image information on the Internet is growing explosively. Digital images are widely used in fields such as commerce, news media, medicine, and education. Helping users find the images they need quickly and accurately has therefore become one of the hot topics of multimedia research in recent years, and the key technologies for solving this problem are image retrieval and automatic image annotation.
Automatic image annotation refers to automatically adding keywords to an image to represent its semantic content. It uses an annotated image collection to automatically learn a relational model between the semantic concept space and the visual feature space, and annotates images of unknown semantics with this model. On the one hand, automatic image annotation attempts to build a bridge between high-level semantic features and low-level visual features; it can therefore alleviate, to some degree, the semantic gap problem that most content-based image retrieval methods suffer from, and has good objectivity. On the other hand, automatic image annotation generates textual information relevant to the image content, with good accuracy. If automatic image annotation can be realized, the existing image retrieval problem can in fact be transformed into the more mature text retrieval problem. Automatic image annotation therefore enables convenient keyword-based image retrieval that matches people's retrieval habits. In general, automatic image annotation involves computer vision, machine learning, information retrieval, and other fields, and has strong research value and potential commercial applications, such as image classification, image retrieval, image understanding, and intelligent image analysis.
According to their main implementation features, existing automatic image annotation methods can be divided into two classes: methods based on probability and statistics, and methods based on machine learning. Although probabilistic-statistical methods scale easily to large data sets, their overall annotation performance is not ideal. Machine-learning methods can annotate quickly once the model is trained, but most current classification and regression learners are shallow-structure algorithms whose generalization ability is limited on complex classification problems. In recent years deep learning, as an innovative branch of machine learning, has been widely used in object recognition, image classification, speech recognition, and other fields, but it has rarely been applied to the image annotation problem. Because deep learning can train deep, complex models, it has great advantages in handling big-data problems. The DBN and CNN models achieve good results on recognition tasks with few labels and simple, complete features, whereas the image annotation problem has numerous labels and varied, complex image features, and real-world images also contain noise such as text of all kinds, web addresses, QR codes, and image watermarks, which greatly affects the performance of DBN and CNN. The SAE network, in contrast, focuses more on the approximate representation of features and can easily express complex inputs as desired outputs by adjusting the model for a particular situation; this patent therefore selects the SAE model to solve the image annotation problem.
Summary of the invention
It is an object of the present invention to provide an automatic image annotation method based on an enhanced stacked auto-encoder, which overcomes the defects of the prior art and solves the automatic image annotation problem for multi-object, multi-label images.
To achieve the above object, the technical scheme of the present invention is an automatic image annotation method based on an enhanced stacked auto-encoder, realized according to the following steps:
Step S1: build a stacked auto-encoder model, discriminate weakly labeled samples on the stacked auto-encoder model, and add noise to increase the number of training passes over the weakly labeled samples, thereby building a balanced stacked auto-encoder model;
Step S2: train sub-balanced stacked auto-encoder models on grouped training images using the balanced stacked auto-encoder model, and weight and accumulate the optimal sub-model of each group to obtain an enhanced balanced stacked auto-encoder model;
Step S3: input an unknown image into the enhanced balanced stacked auto-encoder model and output the annotation result.
In an embodiment of the present invention, step S1 further comprises the following steps:
Step S11: define an encoder $f_\theta$ and a decoder $g_{\theta'}$; the encoder $f_\theta$ maps an input image $x$ to a hidden representation $h$, and the decoder $g_{\theta'}$ reconstructs the hidden representation $h$ into a vector $x'$ with the same dimensionality as the input image $x$; here $f_\theta(x) = \sigma(Wx + b)$ with $\theta = \{W, b\}$, where $W$ is the network weight matrix satisfying $W' = W^T$, $b$ is the bias vector, $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the activation function, and $\theta' = \{W', b'\}$;
Step S12: learn a function such that the output $x' = g_{\theta'}(f_\theta(x))$ approximates the input image $x$; define the loss function $L(x, x') = (x - x')^2$ and learn by minimizing the loss: $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\big(x_i, g_{\theta'}(f_\theta(x_i))\big);$$
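As a minimal illustrative sketch of steps S11 and S12 (not the patented implementation; the layer sizes, sample count, and random initialization are assumed toy values), a single auto-encoder with tied weights $W' = W^T$ maps an input through the encoder and decoder and measures the reconstruction loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, h, N = 8, 4, 32                      # input dim, hidden dim, sample count (assumed)
W = rng.normal(0, 0.1, (h, d))          # encoder weights; decoder uses W.T (tied, W' = W^T)
b, b2 = np.zeros(h), np.zeros(d)        # encoder bias b and decoder bias b'

def encode(x):                          # f_theta(x) = sigma(Wx + b)
    return sigmoid(W @ x + b)

def decode(hid):                        # g_theta'(h) = sigma(W'h + b')
    return sigmoid(W.T @ hid + b2)

X = rng.random((N, d))
# mean reconstruction loss L(x, x') = (x - x')^2 over the sample set;
# the objective of step S12 minimizes this quantity over theta, theta'
loss = np.mean([np.sum((x - decode(encode(x))) ** 2) for x in X])
```

Gradient descent on `loss` with respect to `W`, `b`, `b2` would complete the pre-training of this single layer.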
Step S13: suppose the SAE model used for image annotation has $L$ layers, indexed by $l \in \{1, \dots, L\}$; let $h_l$ denote the output vector of layer $l$, and $W_l$ and $b_l$ the network weights and biases of layer $l$; the parameters $\{W_l, b_l\}, l \in \{1, \dots, L\}$ are pre-trained layer by layer with auto-encoders;
Step S14: perform the feed-forward pass and fine-tune with the back-propagation algorithm; the feed-forward operation of the stacked auto-encoder model is expressed as $h_{l+1} = \sigma(W_{l+1} h_l + b_{l+1}), l \in \{0, \dots, L-1\}$; the back-propagation fine-tuning of the stacked auto-encoder model is expressed as $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} L\big(F_\theta(x_i), Y_i\big),$$ where $F_\theta(x) = \sigma_{\theta_L}(\dots(\sigma_{\theta_1}(x)))$ is the composition of the auto-encoder layers, $\theta_l$ denotes the parameters $\{W_l, b_l\}, l \in \{1, \dots, L\}$, and the loss function is $L(x, y) = (x - y)^2$;
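Under the same illustrative assumptions (the layer sizes below are invented for the sketch), the feed-forward pass of step S14 simply chains the per-layer encoders, starting from $h_0 = x$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
sizes = [8, 6, 4, 3]                     # h0 = input, h3 = output layer (assumed sizes)
params = [(rng.normal(0, 0.1, (m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

def feedforward(x):
    """h_{l+1} = sigma(W_{l+1} h_l + b_{l+1}), starting from h0 = x."""
    h = x
    for W, b in params:
        h = sigmoid(W @ h + b)
    return h

y = feedforward(rng.random(8))
# y is the last-layer output F_theta(x); back-propagation fine-tuning would
# compare it against the label vector Y_i and minimize (F_theta(x_i) - Y_i)^2
```

After layer-wise pre-training fills in `params`, back-propagation adjusts all layers jointly against the labels.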
Step S15: define counting variables; let the vector $C = (c_1, c_2, \dots, c_M)$, where $c_i$ is the number of times keyword $y_i$ occurs in the training set $P$, and let $\Pi = \frac{1}{M}\sum_{j=1}^{M} c_j$ denote the average occurrence count of the keywords; for the $i$-th image $x_i$, the training-set occurrence counts of its keywords $Y_i^j, j \in \{1, 2, \dots, M\}$ form the vector $Y_{C,i} = C * Y_i$; the least-frequently occurring keyword in image $x_i$ is thus $\Lambda_{x_i} = \arg\min_j (Y_{C,i}^j)$;
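A small sketch of the counting in step S15, with toy keyword counts (all values below are assumed for illustration only):

```python
import numpy as np

C = np.array([50, 4, 30, 2])        # c_j: training-set frequency of each of M = 4 keywords
Y_i = np.array([1, 1, 0, 1])        # binary keyword vector of image x_i
Pi = C.mean()                        # average keyword frequency (the threshold Pi)

Y_Ci = C * Y_i                       # element-wise product: counts of x_i's own keywords
present = np.where(Y_i == 1)[0]      # only keywords actually present in x_i
Lambda_xi = Y_Ci[present].min()      # frequency of x_i's rarest keyword (Lambda_{x_i})
```

Here the rarest keyword of `x_i` occurs only twice, far below the average frequency, so this sample would qualify for strengthened training in step S16.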
Step S16: define the function $\Phi(x)$: the stacked auto-encoder model judges each learning sample during training, and if the input image $x$ contains more than $k$ low-frequency labels, suitable noise is added to the input image $x$; define the function $\Gamma(x)$: the training strength of the input image $x$ is increased, and if the occurrence count of the labels contained in the input image $x$ is lower than a predetermined threshold (typically taken as $\Pi = \frac{1}{M}\sum_{j=1}^{M} c_j$), its number of training passes is increased, where the function $\Gamma(x)$ is:
$$\Gamma(x_i) = \begin{cases} \alpha \cdot \dfrac{\Pi}{\Lambda_{x_i}} = \alpha \cdot \dfrac{\frac{1}{M}\sum_{j=1}^{M} c_j}{\arg\min_j (Y_{C,i}^j)}, & \Lambda_{x_i} \le \beta \cdot \Pi \\ 1, & \text{otherwise}, \end{cases}$$
where $\alpha$ and $\beta$ are constant coefficients: $\beta$ determines which samples need strengthened training, and $\alpha$ controls the training strength of the samples to be strengthened;
the function $\Phi(x)$ is:
$$\Phi(x_i) = \begin{cases} \chi \cdot \Big(\dfrac{1}{d}\sum_{j=1}^{d} x_i^j\Big) \cdot \mathrm{Ran}(\cdot), & \Lambda_{x_i} \le \beta \cdot \Pi \\ x_i, & \text{otherwise}, \end{cases}$$
where $\chi$ is a constant coefficient that controls the strength of the added noise, $d$ is the dimensionality of the features of image $x_i$, $x_i^j$ denotes the value of the $j$-th dimension of image $x_i$, and $\mathrm{Ran}(\cdot)$ is a random-number function;
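A hedged sketch of how $\Gamma$ and $\Phi$ from step S16 could be evaluated for one sample. The coefficients $\alpha$, $\beta$, $\chi$ and the per-image counts are assumed toy values, and the noise branch follows the piecewise definitions above literally (a Gaussian is one admissible choice of $\mathrm{Ran}(\cdot)$ per the embodiment):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, chi = 2.0, 0.5, 0.1       # constant coefficients (assumed)
Pi = 21.5                               # average keyword frequency (toy value)
Lambda_xi = 2.0                         # rarest-keyword frequency of x_i (toy value)
x_i = rng.random(8)                     # d = 8 feature vector

def Gamma(Lambda_xi):
    """Training passes: boosted in proportion to how rare the rarest label is."""
    if Lambda_xi <= beta * Pi:
        return alpha * Pi / Lambda_xi
    return 1.0

def Phi(x, Lambda_xi):
    """Noise injection, reading the piecewise formula literally: the rare-label
    branch returns the scaled random vector, otherwise the sample is unchanged."""
    if Lambda_xi <= beta * Pi:
        return chi * x.mean() * rng.standard_normal(x.shape)  # Ran(.) ~ N(0, 1)
    return x

passes = max(1, int(Gamma(Lambda_xi)))  # discrete number of strengthened passes
noisy = Phi(x_i, Lambda_xi)
```

With these toy numbers the rare-label sample is trained roughly twenty times instead of once, which is the intended rebalancing effect.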
Step S17: adjust the optimization equations to obtain the balanced stacked auto-encoder model; adjust $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\big(x_i, g_{\theta'}(f_\theta(x_i))\big)$$ to $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} \Big\{ \frac{1}{\Gamma(x_i)} \sum_{j=1}^{\Gamma(x_i)} L\big(\Phi(x_i), g_{\theta'}(f_\theta(\Phi(x_i)))\big) \Big\},$$ and adjust $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} L\big(F_\theta(x_i), Y_i\big)$$ to $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} \sum_{j=1}^{\Gamma(x_i)} L\big(F_\theta(\Phi(x_i)), Y_i\big);$$ after the model is trained, the output of the last layer of the balanced stacked auto-encoder model is the predicted keyword distribution $D$ for the image.
In an embodiment of the present invention, step S2 further comprises the following steps:
Step S21: train sub-balanced stacked auto-encoder models in groups; the balanced stacked auto-encoder models are divided into different groups according to the noise-injection scheme, and within each group the sub-models $\text{B-SAE}_k^t$ are distinguished by their number of hidden neurons, where $t$ indicates that the balanced stacked auto-encoder model adopts the $t$-th noise-injection scheme and $k$ denotes the number of hidden neurons set for the $k$-th B-SAE sub-model;
Step S22: set the initial weights and compute the classification error of each sub-balanced stacked auto-encoder model; the training data are weighted as follows:
$$W_1 = (w_{11}, \dots, w_{1i}, \dots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \dots, N;$$
the classification error of $\text{B-SAE}_k^t$ is computed as
$$e_k^t = \sum_{i=1}^{N} w_{ti} \cdot \mathrm{Sgn}\big(\text{B-SAE}_k^t(x_i) \ne Y_i\big),$$
where $\mathrm{Sgn}(x) = 1$ if $x$ is true and $0$ if $x$ is false; the predicate $\text{B-SAE}_k^t(x_i) \ne Y_i$ means: suppose the true label set $Y_i$ of image $x_i$ contains $c$ keywords, and the label set $Y_i^*$ predicted by the model $\text{B-SAE}_k^t$ also contains $c$ labels; if $Y_i = Y_i^*$, the predicate is false, otherwise it is true;
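A small sketch of the weighted error of step S22, treating each sub-model's prediction as its top-$c$ predicted labels (the label sets and weights below are toy values):

```python
import numpy as np

w = np.full(4, 0.25)                       # w_1i = 1/N for N = 4 training images
true_labels = [{0, 2}, {1}, {0, 3}, {2}]   # Y_i as sets of keyword indices
pred_labels = [{0, 2}, {3}, {0, 3}, {1}]   # top-c labels Y_i* predicted by one sub-model

def error(w, truth, pred):
    """e = sum_i w_i * Sgn(pred_i != truth_i): weight mass on mispredicted images."""
    return sum(wi for wi, t, p in zip(w, truth, pred) if t != p)

e = error(w, true_labels, pred_labels)     # two of four images mispredicted
```

Within a group, the sub-model minimizing this weighted error is the one retained in step S23.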
Step S23: compute the balanced stacked auto-encoder model weight, and update the training-data weights; from the classification errors of all sub-models $\text{B-SAE}_k^t$ in group $t$, the model $\text{B-SAE}_t$ with the minimum classification error in the group and its corresponding classification error $e_t$ are obtained, and the weight $\alpha_t$ of $\text{B-SAE}_t$ is computed; after the models of group $t$ have been trained, the training-data weights are updated to obtain the weights for the next group of models, as follows:
$$W_{t+1} = \{w_{t+1,1}, \dots, w_{t+1,i}, \dots, w_{t+1,N}\}, \quad w_{t+1,i} = \frac{w_{ti} \cdot e^{-\alpha_t \cdot Y_i \cdot \text{B-SAE}_t(x_i)}}{\sum_{i=1}^{N} w_{ti} \cdot e^{-\alpha_t \cdot Y_i \cdot \text{B-SAE}_t(x_i)}}, \quad i = 1, 2, \dots, N;$$
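This group-wise weighting mirrors an AdaBoost-style update; a hedged sketch follows. The model-weight formula $\alpha_t = \frac{1}{2}\ln\frac{1 - e_t}{e_t}$ is the standard AdaBoost choice and is an assumption here (the patent's own expression for $\alpha_t$ is not reproduced in the text above), and a $\pm 1$ agreement score stands in for $Y_i \cdot \text{B-SAE}_t(x_i)$:

```python
import numpy as np

w = np.full(4, 0.25)                      # current data weights w_ti
e_t = 0.25                                # minimum group error (toy value)
alpha_t = 0.5 * np.log((1 - e_t) / e_t)   # assumed AdaBoost-style model weight

agree = np.array([1, 1, -1, 1])           # +1 where B-SAE_t(x_i) matches Y_i, -1 otherwise
w_next = w * np.exp(-alpha_t * agree)     # shrink correct samples, grow wrong ones
w_next /= w_next.sum()                    # normalized weights for the next group
```

Mispredicted samples gain weight, so the next group of sub-models concentrates on the images the current group got wrong.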
Step S24: weight and accumulate the sub-balanced stacked auto-encoder models to obtain the enhanced balanced stacked auto-encoder model; after all groups have been trained, the keyword prediction distribution is obtained: $$D = \sum_{t=1}^{T} \alpha_t \cdot \text{B-SAE}_t(x).$$
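Finally, the weighted accumulation of step S24 can be sketched as a weighted sum of per-group score vectors (the group count, weights, and scores below are toy values):

```python
import numpy as np

alphas = [0.55, 0.35, 0.20]                      # alpha_t for T = 3 groups (assumed)
preds = [np.array([0.9, 0.1, 0.4]),              # B-SAE_t(x): per-keyword scores
         np.array([0.7, 0.3, 0.2]),
         np.array([0.8, 0.5, 0.1])]

D = sum(a * p for a, p in zip(alphas, preds))    # keyword prediction distribution D
top = int(np.argmax(D))                          # most confident keyword index
```

Annotation then amounts to reading off the highest-scoring entries of `D` as the image's keywords.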
Compared with the prior art, the present invention has the following beneficial effects: the proposed automatic image annotation method based on an enhanced stacked auto-encoder exploits the powerful feature-representation ability of the SAE deep neural network; based on an understanding of automatic image annotation, multi-label classification, and the stacked auto-encoder, it addresses problems such as label imbalance in image data sets and the difficulty of effectively training on large image data, and finally obtains a deep, complex automatic image annotation model. The method is simple, flexible to implement, and highly practical.
Accompanying drawing explanation
Fig. 1 is the flow chart of the automatic image annotation method based on an enhanced stacked auto-encoder of the present invention.
Embodiment
The technical scheme of the present invention is described in detail below with reference to the accompanying drawings.
The present invention proposes an automatic image annotation method based on an enhanced stacked auto-encoder. First, to address the problem that the traditional Stacked Auto-Encoder (SAE) model in deep learning has difficulty training effectively on biased data sets, a Balanced Stacked Auto-Encoder (B-SAE) that improves the accuracy of low-frequency labels is proposed, which improves the annotation performance on low-frequency labels. Then, to address the problem that a single B-SAE model is unstable (the model is complex and has many parameters), so that its annotation performance changes greatly as the parameters change, an Enhanced Balanced Stacked Auto-Encoder (EB-SAE) for the image annotation task is proposed: sub-models are trained group by group in sequence, and the optimal B-SAE sub-model of each group is weighted and accumulated, obtaining a stable annotation result. The concrete steps are as follows:
S1: first build the SAE model, then discriminate weakly labeled samples on the SAE model and add noise to increase their number of training passes, thereby building the B-SAE model;
S2: use the B-SAE model obtained in step S1 to train sub-B-SAE models on grouped training images, and weight and accumulate the optimal sub-model of each group to obtain the EB-SAE model, as shown in Fig. 1;
S3: input an unknown image into the EB-SAE model obtained in step S2 and output the annotation result.
Further, in the present embodiment, the B-SAE model is built in step S1 according to the following steps:
Step S11: define an encoder $f_\theta$ and a decoder $g_{\theta'}$; the encoder $f_\theta$ maps an input image $x$ to a hidden representation $h$, and the decoder $g_{\theta'}$ reconstructs $h$ into a vector $x'$ with the same dimensionality as $x$. Here $f_\theta(x) = \sigma(Wx + b)$, where $\theta = \{W, b\}$, $W$ is the network weight matrix satisfying $W' = W^T$, $b$ is the bias vector, $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the activation function, and $\theta' = \{W', b'\}$.
Step S12: learn a function such that the output $x' = g_{\theta'}(f_\theta(x))$ approximates $x$; define the loss function $L(x, x') = (x - x')^2$; the model then learns by minimizing the loss: $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\big(x_i, g_{\theta'}(f_\theta(x_i))\big).$$
Step S13: perform the feed-forward pass and fine-tune with the back-propagation algorithm; suppose the SAE model for image annotation has $L$ layers, indexed by $l \in \{1, \dots, L\}$. Let $h_l$ denote the output vector of layer $l$ ($h_0 = x$ is the input and $h_L$ is the output), and $W_l$ and $b_l$ the network weights and biases of layer $l$. As described above, $\{W_l, b_l\}, l \in \{1, \dots, L\}$ are pre-trained layer by layer using auto-encoders. The feed-forward operation of the SAE can be expressed as $h_{l+1} = \sigma(W_{l+1} h_l + b_{l+1}), l \in \{0, \dots, L-1\}$, and the whole model is fine-tuned with back-propagation: $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} L\big(F_\theta(x_i), Y_i\big),$$ where $F_\theta(x) = \sigma_{\theta_L}(\dots(\sigma_{\theta_1}(x)))$ is the composition of the auto-encoder layers, $\theta_l$ denotes the parameters $\{W_l, b_l\}, l \in \{1, \dots, L\}$, and the loss function is defined as $L(x, y) = (x - y)^2$.
Step S14: define counting variables; let the vector $C = (c_1, c_2, \dots, c_M)$, where $c_i$ is the number of times keyword $y_i$ occurs in the training set $P$, and $\Pi = \frac{1}{M}\sum_{j=1}^{M} c_j$ is the average occurrence count of the keywords. From this we obtain, for the $i$-th image $x_i$, the vector of training-set occurrence counts of its keywords $Y_i^j, j \in \{1, 2, \dots, M\}$: $Y_{C,i} = C * Y_i$ (where $*$ denotes element-wise multiplication of the two vectors, yielding a new vector). The least-frequently occurring keyword in image $x_i$ is thus $\Lambda_{x_i} = \arg\min_j (Y_{C,i}^j)$.
Step S15: define the function $\Phi(x)$, which lets the model judge each learning sample during the training process; if a sample $x$ (that is, an input image $x$) contains more than $k$ low-frequency labels, suitable noise is added to it. Define the function $\Gamma(x)$, which increases the training strength of sample $x$; if the occurrence count of the labels contained in the sample is below a certain threshold, its number of training passes is increased. In the present embodiment this threshold is generally taken as $\Pi = \frac{1}{M}\sum_{j=1}^{M} c_j$.
$$\Gamma(x_i) = \begin{cases} \alpha \cdot \dfrac{\Pi}{\Lambda_{x_i}} = \alpha \cdot \dfrac{\frac{1}{M}\sum_{j=1}^{M} c_j}{\arg\min_j (Y_{C,i}^j)}, & \Lambda_{x_i} \le \beta \cdot \Pi \\ 1, & \text{otherwise}, \end{cases}$$
where $\alpha$ and $\beta$ are constant coefficients: $\beta$ is used to determine which samples need strengthened training, and $\alpha$ controls the training strength of the samples to be strengthened.
$$\Phi(x_i) = \begin{cases} \chi \cdot \Big(\dfrac{1}{d}\sum_{j=1}^{d} x_i^j\Big) \cdot \mathrm{Ran}(\cdot), & \Lambda_{x_i} \le \beta \cdot \Pi \\ x_i, & \text{otherwise}, \end{cases}$$
where $\chi$ is a constant coefficient that controls the strength of the added noise, $d$ is the dimensionality of the features of image $x_i$, $x_i^j$ denotes the value of the $j$-th dimension of image $x_i$, and $\mathrm{Ran}(\cdot)$ is a random-number function; for example, $\mathrm{Ran}(\cdot)$ may be a random function obeying a $(0, 1)$ Gaussian distribution, or a random function uniformly distributed on $[0, 1]$.
Step S16: adjust the optimization equations to obtain the B-SAE model; $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\big(x_i, g_{\theta'}(f_\theta(x_i))\big)$$ is adjusted to $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} \Big\{ \frac{1}{\Gamma(x_i)} \sum_{j=1}^{\Gamma(x_i)} L\big(\Phi(x_i), g_{\theta'}(f_\theta(\Phi(x_i)))\big) \Big\},$$ and $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} L\big(F_\theta(x_i), Y_i\big)$$ is adjusted to $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} \sum_{j=1}^{\Gamma(x_i)} L\big(F_\theta(\Phi(x_i)), Y_i\big).$$ After the model is trained, the output of the last layer of the B-SAE is the predicted keyword distribution $D$ for the image.
Further, in the present embodiment, the EB-SAE model is trained in step S2 according to the following steps:
Step S21: train sub-B-SAE models in groups; the B-SAE models are divided into different groups according to the noise-injection scheme, and within each group the sub-models $\text{B-SAE}_k^t$ are distinguished by their number of hidden neurons, where $t$ indicates that the B-SAE model adopts the $t$-th noise-injection scheme and $k$ denotes the number of hidden neurons set for the $k$-th B-SAE sub-model.
Step S22: set the initial weights and compute the classification error of each sub-B-SAE model; the training data are weighted as follows:
$$W_1 = (w_{11}, \dots, w_{1i}, \dots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \dots, N;$$
the classification error of $\text{B-SAE}_k^t$ can then be computed as
$$e_k^t = \sum_{i=1}^{N} w_{ti} \cdot \mathrm{Sgn}\big(\text{B-SAE}_k^t(x_i) \ne Y_i\big),$$
where $\mathrm{Sgn}(x) = 1$ if $x$ is true and $0$ if $x$ is false; the predicate $\text{B-SAE}_k^t(x_i) \ne Y_i$ means: suppose the true label set $Y_i$ of image $x_i$ contains $c$ keywords, and the label set $Y_i^*$ predicted by the model $\text{B-SAE}_k^t$ also contains $c$ labels; if $Y_i = Y_i^*$, the predicate is false, otherwise it is true.
Step S23: compute the B-SAE model weight and update the training-data weights; from the classification errors of all sub-models $\text{B-SAE}_k^t$ in group $t$, the model $\text{B-SAE}_t$ with the minimum classification error in the group and its corresponding classification error $e_t$ are obtained, and the weight $\alpha_t$ of $\text{B-SAE}_t$ is computed. After the models of group $t$ have been trained, the training-data weights must be updated so as to obtain better weights for the next group of models; they are updated as follows:
$$W_{t+1} = \{w_{t+1,1}, \dots, w_{t+1,i}, \dots, w_{t+1,N}\}, \quad w_{t+1,i} = \frac{w_{ti} \cdot e^{-\alpha_t \cdot Y_i \cdot \text{B-SAE}_t(x_i)}}{\sum_{i=1}^{N} w_{ti} \cdot e^{-\alpha_t \cdot Y_i \cdot \text{B-SAE}_t(x_i)}}, \quad i = 1, 2, \dots, N.$$
Step S24: weight and accumulate the sub-B-SAE models to obtain the EB-SAE model; after all groups have been trained, the keyword prediction distribution is obtained: $$D = \sum_{t=1}^{T} \alpha_t \cdot \text{B-SAE}_t(x).$$
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention, where the effects produced do not exceed the scope of the technical scheme of the present invention, belong to the protection scope of the present invention.

Claims (3)

1. An automatic image annotation method based on an enhanced stacked auto-encoder, characterized in that it is realized according to the following steps:
Step S1: build a stacked auto-encoder model, discriminate weakly labeled samples on the stacked auto-encoder model, and add noise to increase the number of training passes over the weakly labeled samples, thereby building a balanced stacked auto-encoder model;
Step S2: train sub-balanced stacked auto-encoder models on grouped training images using the balanced stacked auto-encoder model, and weight and accumulate the optimal sub-model of each group to obtain an enhanced balanced stacked auto-encoder model;
Step S3: input an unknown image into the enhanced balanced stacked auto-encoder model and output the annotation result.
2. The automatic image annotation method based on an enhanced stacked auto-encoder according to claim 1, characterized in that step S1 further comprises the following steps:
Step S11: define an encoder $f_\theta$ and a decoder $g_{\theta'}$; the encoder $f_\theta$ maps an input image $x$ to a hidden representation $h$, and the decoder $g_{\theta'}$ reconstructs the hidden representation $h$ into a vector $x'$ with the same dimensionality as the input image $x$; here $f_\theta(x) = \sigma(Wx + b)$ with $\theta = \{W, b\}$, where $W$ is the network weight matrix satisfying $W' = W^T$, $b$ is the bias vector, $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the activation function, and $\theta' = \{W', b'\}$;
Step S12: learn a function such that the output $x' = g_{\theta'}(f_\theta(x))$ approximates the input image $x$; define the loss function $L(x, x') = (x - x')^2$ and learn by minimizing the loss: $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\big(x_i, g_{\theta'}(f_\theta(x_i))\big);$$
Step S13: suppose the SAE model used for image annotation has $L$ layers, indexed by $l \in \{1, \dots, L\}$; let $h_l$ denote the output vector of layer $l$, and $W_l$ and $b_l$ the network weights and biases of layer $l$; the parameters $\{W_l, b_l\}, l \in \{1, \dots, L\}$ are pre-trained layer by layer with auto-encoders;
Step S14: perform the feed-forward pass and fine-tune with the back-propagation algorithm; the feed-forward operation of the stacked auto-encoder model is expressed as $h_{l+1} = \sigma(W_{l+1} h_l + b_{l+1}), l \in \{0, \dots, L-1\}$; the back-propagation fine-tuning of the stacked auto-encoder model is expressed as $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} L\big(F_\theta(x_i), Y_i\big),$$ where $F_\theta(x) = \sigma_{\theta_L}(\dots(\sigma_{\theta_1}(x)))$ is the composition of the auto-encoder layers, $\theta_l$ denotes the parameters $\{W_l, b_l\}, l \in \{1, \dots, L\}$, and the loss function is $L(x, y) = (x - y)^2$;
Step S15: define counting variables; let the vector $C = (c_1, c_2, \dots, c_M)$, where $c_i$ is the number of times keyword $y_i$ occurs in the training set $P$, and let $\Pi = \frac{1}{M}\sum_{j=1}^{M} c_j$ denote the average occurrence count of the keywords; for the $i$-th image $x_i$, the training-set occurrence counts of its keywords $Y_i^j, j \in \{1, 2, \dots, M\}$ form the vector $Y_{C,i} = C * Y_i$; the least-frequently occurring keyword in image $x_i$ is thus $\Lambda_{x_i} = \arg\min_j (Y_{C,i}^j)$;
Step S16: define the function $\Phi(x)$: the stacked auto-encoder model judges each learning sample during training, and if the input image $x$ contains more than $k$ low-frequency labels, suitable noise is added to the input image $x$; define the function $\Gamma(x)$: the training strength of the input image $x$ is increased, and if the occurrence count of the labels contained in the input image $x$ is lower than a predetermined threshold, its number of training passes is increased, where the function $\Gamma(x)$ is:
$$\Gamma(x_i) = \begin{cases} \alpha \cdot \dfrac{\Pi}{\Lambda_{x_i}} = \alpha \cdot \dfrac{\frac{1}{M}\sum_{j=1}^{M} c_j}{\arg\min_j (Y_{C,i}^j)}, & \Lambda_{x_i} \le \beta \cdot \Pi \\ 1, & \text{otherwise}, \end{cases}$$
where $\alpha$ and $\beta$ are constant coefficients: $\beta$ determines which samples need strengthened training, and $\alpha$ controls the training strength of those samples;
the function $\Phi(x)$ is:
$$\Phi(x_i) = \begin{cases} \chi \cdot \Big(\dfrac{1}{d}\sum_{j=1}^{d} x_i^j\Big) \cdot \mathrm{Ran}(\cdot), & \Lambda_{x_i} \le \beta \cdot \Pi \\ x_i, & \text{otherwise}, \end{cases}$$
where $\chi$ is a constant coefficient that controls the strength of the added noise, $d$ is the dimensionality of the features of image $x_i$, $x_i^j$ denotes the value of the $j$-th dimension of image $x_i$, and $\mathrm{Ran}(\cdot)$ is a random-number function;
Step S17: adjust the optimization equations to obtain the balanced stacked auto-encoder model; adjust $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\big(x_i, g_{\theta'}(f_\theta(x_i))\big)$$ to $$\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} \Big\{ \frac{1}{\Gamma(x_i)} \sum_{j=1}^{\Gamma(x_i)} L\big(\Phi(x_i), g_{\theta'}(f_\theta(\Phi(x_i)))\big) \Big\},$$ and adjust $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} L\big(F_\theta(x_i), Y_i\big)$$ to $$\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} \sum_{j=1}^{\Gamma(x_i)} L\big(F_\theta(\Phi(x_i)), Y_i\big);$$ after the model is trained, the output of the last layer of the balanced stacked auto-encoder model is the predicted keyword distribution $D$ for the image.
3. The automatic image annotation method based on an enhanced stacked auto-encoder according to claim 1, characterized in that step S2 further comprises the following steps:
Step S21: grouping training quantum balancing stack compiler model, makes an uproar balance stack compiler model the different group of model split by different adding, divides submodel according to different hidden neuron numbers in each groupT represents that balance stack compiler model adopts t kind to add the mode of making an uproar, and k represents the hidden neuron number that the sub-B-SAE model of kth is arranged;
Step S22: initial weight is set and calculates quantum balancing stack compiler model model classification specific inaccuracy, training data is arranged weights as follows:
W = ( w 11 , ... , w 1 i , ... , w 1 N ) , w 1 i = 1 N , i = 1 , 2 , ... , N ,
CalculateClassification specific inaccuracy: e k t = &Sigma; i = 1 N w t i &CenterDot; S g n ( B - SAE k t ( x i ) &NotEqual; Y i ) , Wherein, S g n ( x ) = 1 , x = t r u e 0 , x = f a l s e , Represent: assume image xiTrue tag collection YiComprise c keyword, and pass through modelPrediction obtains label collection Yi *Number be also c, if Yi=Yi *, thenFor false, otherwise it is true;
Step S23: calculated equilibrium stack compiler model weight, and upgrade training data weights; According to all sons in groupThe classification specific inaccuracy of model, it is possible to obtain the Model B-SAE that this component class specific inaccuracy is minimumtAnd the classification specific inaccuracy e of correspondencet, calculate B-SAEtWeight:After the model training of t group is complete, upgrading the weights of training data, to obtain the weight of next group model, the mode upgrading training data weights is as follows:
W t + 1 = { w t + 1 , 1 , ... , w t + 1 , i , ... , w t + 1 , N } , w t + 1 , i = w t i &CenterDot; e ( - &alpha; t &CenterDot; Y i &CenterDot; B - SAE t ( x i ) ) &Sigma; i = 1 N w t i &CenterDot; e ( - &alpha; t &CenterDot; Y i &CenterDot; B - SAE t ( x i ) ) , i = 1 , 2 , ... , N , ;
Step S24: accumulate the weighted sub B-SAE models into the enhanced balanced stacked autoencoder model. Once all groups have been trained, the keyword prediction distribution is obtained as: $D = \sum_{t=1}^{T} \alpha_t \cdot B\text{-}SAE_t(x)$.
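Steps S21 through S24 together describe an AdaBoost-style loop over groups of models. The sketch below, under stated assumptions, replaces the sub B-SAE neural networks with simple hypothetical threshold classifiers and uses binary +1/-1 labels so that the product $Y_i \cdot B\text{-}SAE_t(x_i)$ in the exponential weight update is well defined; the names `boost`, `predict`, and the stump lambdas are all illustrative, not from the patent.

```python
import math

def boost(groups, X, Y, T):
    """For each of T groups, pick the minimum-error model (Step S23),
    weight it by alpha_t = 0.5*ln((1-e_t)/e_t), and reweight the data."""
    N = len(X)
    w = [1.0 / N] * N                      # Step S22: w_1i = 1/N
    ensemble = []
    for t in range(T):
        # choose the model with minimum weighted error in this group
        best, e_t = None, float("inf")
        for m in groups[t]:
            e = sum(wi for wi, x, y in zip(w, X, Y) if m(x) != y)
            if e < e_t:
                best, e_t = m, e
        alpha = 0.5 * math.log((1 - e_t) / max(e_t, 1e-12))
        ensemble.append((alpha, best))
        # Step S23 update: w_{t+1,i} proportional to w_ti * exp(-alpha*y_i*model(x_i))
        w = [wi * math.exp(-alpha * y * best(x)) for wi, x, y in zip(w, X, Y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    # Step S24: D = sum_t alpha_t * B-SAE_t(x)
    return sum(alpha * m(x) for alpha, m in ensemble)

# Usage on toy data: one group of two threshold "stumps"
X = [0.0, 1.0, 2.0, 3.0]
Y = [-1, -1, 1, 1]
groups = [
    [lambda x: 1 if x > 1.5 else -1,   # perfect on this toy data
     lambda x: 1 if x > 2.5 else -1],  # misclassifies x = 2
]
ensemble = boost(groups, X, Y, T=1)
```

The `max(e_t, 1e-12)` guard is a practical detail for the zero-error case on toy data; the patent's claim does not specify how a zero error is handled.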
CN201610035975.7A 2016-01-20 2016-01-20 A kind of automatic image marking method based on enhanced stack autocoder Active CN105678340B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610035975.7A CN105678340B (en) 2016-01-20 2016-01-20 A kind of automatic image marking method based on enhanced stack autocoder


Publications (2)

Publication Number Publication Date
CN105678340A true CN105678340A (en) 2016-06-15
CN105678340B CN105678340B (en) 2018-12-25

Family

ID=56301673

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610035975.7A Active CN105678340B (en) 2016-01-20 2016-01-20 A kind of automatic image marking method based on enhanced stack autocoder

Country Status (1)

Country Link
CN (1) CN105678340B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030158830A1 (en) * 2000-04-11 2003-08-21 Adam Kowalczyk Gradient based training method for a support vector machine
CN104156736A (en) * 2014-09-05 2014-11-19 西安电子科技大学 Polarized SAR image classification method on basis of SAE and IDL
CN104166859A (en) * 2014-08-13 2014-11-26 西安电子科技大学 Polarization SAR image classification based on SSAE and FSALS-SVM
CN104679863A (en) * 2015-02-28 2015-06-03 武汉烽火众智数字技术有限责任公司 Method and system for searching images by images based on deep learning
CN105184303A (en) * 2015-04-23 2015-12-23 南京邮电大学 Image marking method based on multi-mode deep learning


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250915A (en) * 2016-07-22 2016-12-21 福州大学 A kind of automatic image marking method merging depth characteristic and semantic neighborhood
CN106250915B (en) * 2016-07-22 2019-08-09 福州大学 A kind of automatic image marking method of fusion depth characteristic and semantic neighborhood
CN109271539A (en) * 2018-08-31 2019-01-25 华中科技大学 A kind of image automatic annotation method and device based on deep learning
CN111914617A (en) * 2020-06-10 2020-11-10 华南理工大学 Face attribute editing method based on balanced stack type generation countermeasure network
CN111914617B (en) * 2020-06-10 2024-05-07 华南理工大学 Face attribute editing method based on balanced stack type generation type countermeasure network
CN114035098A (en) * 2021-12-14 2022-02-11 北京航空航天大学 Lithium battery health state prediction method integrating future working condition information and historical state information

Also Published As

Publication number Publication date
CN105678340B (en) 2018-12-25

Similar Documents

Publication Publication Date Title
US11868724B2 (en) Generating author vectors
US20180293313A1 (en) Video content retrieval system
Tur et al. Combining active and semi-supervised learning for spoken language understanding
CN111563143B (en) Method and device for determining new words
CN111460157B (en) Cyclic convolution multitask learning method for multi-field text classification
CN110297888B (en) Domain classification method based on prefix tree and cyclic neural network
CN104572631B (en) The training method and system of a kind of language model
CN105678340A (en) Automatic image marking method based on enhanced stack type automatic encoder
CN109446420B (en) Cross-domain collaborative filtering method and system
CN106815310A (en) A kind of hierarchy clustering method and system to magnanimity document sets
CN110825850B (en) Natural language theme classification method and device
Chen et al. Progressive EM for latent tree models and hierarchical topic detection
CN110019779B (en) Text classification method, model training method and device
CN107526805B (en) ML-kNN multi-tag Chinese text classification method based on weight
CN113987187A (en) Multi-label embedding-based public opinion text classification method, system, terminal and medium
CN105701516B (en) A kind of automatic image marking method differentiated based on attribute
CN105701225A (en) Cross-media search method based on unification association supergraph protocol
Aziguli et al. A robust text classifier based on denoising deep neural network in the analysis of big data
Guan et al. Hierarchical neural network for online news popularity prediction
CN109670169B (en) Deep learning emotion classification method based on feature extraction
CN109190471B (en) Attention model method for video monitoring pedestrian search based on natural language description
Tian et al. Deep incremental hashing for semantic image retrieval with concept drift
CN112489689B (en) Cross-database voice emotion recognition method and device based on multi-scale difference countermeasure
CN113204975A (en) Sensitive character wind identification method based on remote supervision
Pathuri et al. Feature based sentimental analysis for prediction of mobile reviews using hybrid bag-boost algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant