CN106681305A

CN106681305A - Online fault diagnosing method for Fast RVM (relevance vector machine) sewage treatment

Info

Publication number: CN106681305A
Application number: CN201710000827.6A
Authority: CN
Inventors: 许玉格; 邓文凯; 陈立定
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2017-01-03
Filing date: 2017-01-03
Publication date: 2017-05-17

Abstract

The invention discloses an online fault diagnosis method for sewage treatment based on the Fast RVM (fast relevance vector machine). The method comprises the following steps: first, samples with incomplete attributes are removed from the sewage data, the remaining samples are normalized into the [0, 1] interval, and a historical data set and an update test set are determined; second, a clustering-based relevance vector machine method is used to compress the majority-class data of the historical data set; third, the synthetic minority over-sampling method is used to expand the minority-class data of the historical data set; fourth, a one-versus-one ("one-to-one") fast relevance vector machine multi-class training model is built; fifth, new samples from the update test set are fed into the model for testing and the historical data set is updated; sixth, the method returns to the second step, reprocesses the imbalanced historical data, retrains the model, and repeats the process until the online data test is finished. The method effectively reduces the imbalance of the sewage data, increases classification accuracy, speeds up online updating, allows operating faults to be diagnosed in real time, and helps ensure the safe operation of a sewage treatment plant.

Description

An online fault diagnosis method for Fast RVM sewage treatment

Technical field

The present invention relates to the field of sewage treatment, and more particularly to an online fault diagnosis method for Fast RVM sewage treatment.

Background art

Environmental protection has become an important foundation of China's sustainable economic development. As China's industrial economy develops rapidly and urbanization accelerates, industrial wastewater discharge grows quickly along with industrial water consumption. Most of this wastewater is discharged directly, severely polluting rivers and other water bodies, upsetting the ecological balance and indirectly affecting people's lives. Sewage treatment plants are the key protective barrier for natural waters, and how well they operate directly affects the safety of the water environment. The biochemical sewage treatment process is complex and subject to many influencing factors, so a plant has difficulty maintaining stable operation over the long term. Once an operating fault occurs, it commonly causes serious problems such as effluent quality failing to meet the standard, increased operating costs and secondary environmental pollution. It is therefore necessary to monitor the operating state of a sewage treatment plant so that faults in the treatment process are diagnosed and handled in time.

Fault diagnosis of the operating state of a sewage treatment process is essentially a pattern classification problem. When actual operating states are classified, the sewage data set is often imbalanced across classes. The prior art has limitations here: when applied to imbalanced data, model classification accuracy cannot meet requirements, which makes fault diagnosis of biochemical sewage treatment very difficult. Moreover, in a real process fault diagnosis is a continuous learning process. Its salient feature is that learning is not carried out once offline; data arrive one by one and the model is optimized continually. An online learning method must finish training before the next data arrive, otherwise the next decision step is affected. Because fault information arising during plant operation is particularly important, online fault diagnosis places particular emphasis on speed and accuracy.

Summary of the invention

The object of the present invention is to overcome the deficiencies of the prior art by providing a Fast RVM online fault diagnosis method for sewage treatment based on clustering of imbalanced data. The majority-class data are compressed by a clustering-based fast relevance vector machine method and the minority-class data are expanded by the synthetic minority over-sampling method, which reduces the imbalance of the sewage data and improves classification accuracy. At the same time, a multi-class model of the biochemical sewage treatment process is built with Fast RVM, which speeds up online updating. Together these ensure the accuracy and real-time performance of online fault diagnosis of the sewage treatment process.

To achieve the above object, the technical solution provided by the present invention is an online fault diagnosis method for Fast RVM sewage treatment, comprising the following steps:

S1. Remove the samples with incomplete attributes from the sewage data. Because the input variables have different dimensions (units), normalize the data into the [0, 1] interval, and determine the historical data set x_old and the update test set x_new;

S2. Compress the majority-class samples in the historical data using the clustering-based fast relevance vector machine method;

S3. Expand the minority-class samples in the historical data using the synthetic minority over-sampling method;

S4. Recombine the sample data of all classes in the processed historical data into a new historical training set, and build a one-versus-one ("one-to-one") fast relevance vector machine multi-class training model;

S5. Take k new samples from the update test set x_new and test them in the model, save the classification results, add the new samples to the historical data set, and remove the first k samples from the historical data set;

S6. Return to step S2, process the imbalanced historical data again and retrain the model. Repeat this process until the online update data have all been tested, giving the final online test result and thereby identifying the online operating state of the sewage treatment process.
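Step S1 can be illustrated with a minimal sketch. The text only states that the data are normalized into [0, 1]; the column-wise min-max formula used here is an assumption, as are the function names `drop_incomplete` and `min_max_normalize`, the use of `None` to mark a missing attribute, and the toy data.

```python
def drop_incomplete(rows):
    """Keep only the rows in which every attribute is present (not None)."""
    return [row for row in rows if all(v is not None for v in row)]


def min_max_normalize(rows):
    """Scale every attribute (column) into [0, 1].

    Assumed formula: (v - min) / (max - min) per column.
    A constant column is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Toy data: four records with two attributes; the second record is incomplete.
raw = [[1.0, 10.0], [3.0, None], [5.0, 30.0], [3.0, 20.0]]
clean = drop_incomplete(raw)
scaled = min_max_normalize(clean)
```

Normalizing per attribute keeps variables measured in different units from dominating the distance computations used later in clustering and over-sampling.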

Step S2 is specifically as follows:

S201. Suppose the majority-class sample set X = {x₁, x₂, …, x_i, …, x_n} consists of n data points in the space R^d, where d is the dimension of the sample attributes. Randomly select k objects from the n data objects as the initial cluster centres;

S202. Assign each remaining sample object to the nearest cluster centre according to its distance from each cluster centre. Let c_j be the centre of the j-th class, and let x_{im} and c_{jm} denote the m-th attributes of x_i and c_j; the distance between x_i and c_j is:

$$d(x_i, c_j) = \sqrt{(x_{i1} - c_{j1})^2 + \dots + (x_{im} - c_{jm})^2 + \dots + (x_{id} - c_{jd})^2} \qquad (1)$$

S203. Update the cluster centre of each class from the points it contains. Suppose the j-th class contains the n_j samples {x_{j1}, x_{j2}, …, x_{jn_j}}; its cluster centre is then c_j = (c_j^1, c_j^2, …, c_j^d), where c_j^m, the m-th attribute of the class centre c_j (written c_{jm} in formula (1)), is computed as:

$$c_j^m = \frac{x_{j1}^m + x_{j2}^m + \dots + x_{jn_j}^m}{n_j} \qquad (2)$$

S204. Repeat steps S202 and S203 until the criterion function converges. The mean square deviation is used as the criterion function:

$$J = \sqrt{\frac{\sum_{j=1}^{k} \sum_{q=1}^{n_j} \left(d(x_{jq}, c_j)\right)^2}{n - 1}} \qquad (3)$$

S205. Build a fast relevance vector machine classification model on the clustered majority-class samples to obtain a certain number of relevance vectors. These relevance vectors are far fewer than the original majority-class data yet remain representative. The selected relevance vectors then replace the original majority-class samples, which compresses the majority class.
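Steps S201 to S204 can be sketched as a plain k-means loop. This is only an illustration under stated assumptions: the function names, the seeded random initialization, the tolerance `tol` used to decide convergence of J, and the toy data are not from the text, and step S205 (the Fast RVM modelling of the clustered samples) is not covered.

```python
import math
import random


def euclidean(x, c):
    """Formula (1): Euclidean distance between sample x and centre c."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))


def k_means(samples, k, max_iter=100, tol=1e-9, seed=0):
    rng = random.Random(seed)
    # S201: pick k samples at random as the initial cluster centres.
    centres = [list(s) for s in rng.sample(samples, k)]
    prev_j = float("inf")
    for _ in range(max_iter):
        # S202: assign every sample to its nearest centre.
        labels = [min(range(k), key=lambda j: euclidean(x, centres[j]))
                  for x in samples]
        # S203: formula (2), each centre becomes the attribute-wise mean.
        for j in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
        # S204: formula (3), stop once the criterion J no longer changes.
        sse = sum(euclidean(x, centres[lab]) ** 2
                  for x, lab in zip(samples, labels))
        j_value = math.sqrt(sse / (len(samples) - 1))
        if abs(prev_j - j_value) < tol:
            break
        prev_j = j_value
    return centres, labels, j_value


# Toy data: two well-separated groups of two points each.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centres, labels, j_value = k_means(points, k=2)
```

In the method itself, the clustered majority-class samples would next be passed to the Fast RVM so that only the resulting relevance vectors are retained.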

Step S3 is specifically as follows:

S301. For each sample x in the minority class, compute its Euclidean distance to every sample in the minority-class sample set, obtain its k nearest neighbours, and record the indices of the neighbouring samples;

S302. According to the over-sampling rate N, for each minority-class sample x, randomly select N samples from its k nearest neighbours, denoted y₁, y₂, …, y_N;

S303. Perform random linear interpolation between the original sample x and each y_j (j = 1, 2, …, N) to construct the new minority-class samples p_j:

p_j = x + rand(0,1) * (y_j − x), j = 1, 2, …, N (4)

where rand(0,1) denotes a random number in the interval (0,1).
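Steps S301 to S303 can be sketched as follows. The function name `smote`, the seeded generator and the toy data are illustrative assumptions. Two further assumptions are made where the text is silent: the N neighbours are drawn with replacement (so that N may exceed k), and `random()` is used for rand(0,1) although it samples from [0, 1).

```python
import math
import random


def smote(minority, n_new_per_sample, k=5, seed=0):
    """Create n_new_per_sample synthetic samples for every minority sample."""
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # S301: indices of the k nearest minority neighbours of x (x excluded).
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours = others[:k]
        # S302: draw N neighbours at random (assumption: with replacement).
        for _ in range(n_new_per_sample):
            y = minority[rng.choice(neighbours)]
            # S303, formula (4): random point on the segment from x to y.
            gap = rng.random()
            synthetic.append([a + gap * (b - a) for a, b in zip(x, y)])
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new_per_sample=2, k=2)
```

Because each synthetic sample lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.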

In step S4, the one-versus-one fast relevance vector machine multi-class training model is built as follows:

The processed historical data are defined as {z_n, t_n}, n = 1, …, N, with z_n ∈ R^d and t_n ∈ R, where N is the number of samples in the data set, n is the sample index, d is the dimension of the sample attributes, z_n is the input of a sample and t_n is its target value. The prediction function is given by formula (5):

t_n = y(z_n; w) + ε_n (5)

where y(z; w) is defined by formula (6):

$$y(z; w) = \sum_{i=1}^{N} w_i K(z, z_i) + w_0 \qquad (6)$$

Here K(z, z_i) is the kernel function, w_i is the weight of the corresponding basis function, w = [w₀, w₁, …, w_N]^T, and ε_n is noise with ε_n ~ N(0, σ²), so that t_n ~ N(y(z_n, w), σ²). Assuming the targets t_n are mutually independent:

$$p(t \mid \sigma^2, w) = \prod_{i=1}^{N} \mathcal{N}\left(t_i \mid y(z_i, w), \sigma^2\right) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2}\right) \qquad (7)$$

where Φ is the N × (N+1) design matrix. To avoid over-fitting, the weights w of the model must be constrained; they are assumed to follow a Gaussian prior distribution with hyperparameter α. When a new set of input variables is given, the distribution p(t* | t) of the corresponding target value t* is determined by the posterior p(w, α, σ² | t). From the prior distribution and the likelihood, the posterior distribution of the weights is obtained:

p(w, α, σ² | t) = p(w | t, α, σ²) p(α, σ² | t) (8)

Approximating the above, the problem finally becomes one of maximizing p(α, σ² | t) ∝ p(t | α, σ²) p(α) p(σ²), that is, finding the most probable values α_MP and σ²_MP of the parameters α and σ².
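For orientation, the Gaussian prior and the resulting weight posterior can be written out as in the standard relevance vector machine formulation, which the passage above appears to follow. The zero-mean form of the prior and the closed-form posterior below are those of the standard regression RVM and are an assumption about this method, since the text does not reproduce them; the matrix A is notation introduced here.

```latex
% Standard RVM prior on the weights (assumed form; one hyperparameter per weight)
p(w \mid \alpha) = \prod_{i=0}^{N} \mathcal{N}\left(w_i \mid 0, \alpha_i^{-1}\right)

% Posterior over the weights for fixed \alpha and \sigma^2 (regression case)
p(w \mid t, \alpha, \sigma^2) = \mathcal{N}(w \mid \mu, \Sigma), \qquad
\Sigma = \left(\sigma^{-2}\,\Phi^{\top}\Phi + A\right)^{-1}, \qquad
\mu = \sigma^{-2}\,\Sigma\,\Phi^{\top} t, \qquad
A = \operatorname{diag}(\alpha_0, \dots, \alpha_N)
```

A large α_i forces the corresponding weight towards zero, which is how the model becomes sparse. In the classification case the posterior is not Gaussian and is approximated, for example by the Laplace approximation.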

During training, the fast relevance vector machine starts from an empty set and dynamically adds columns to the basis matrix Φ to increase the marginal likelihood function, or removes redundant columns of Φ to increase the objective function. Taking the logarithm of the marginal likelihood p(t | α, σ²) and writing L(α) = log[p(t | α, σ²)], rearrangement gives:

$$L(\alpha) = L(\alpha_{-i}) + \frac{1}{2}\left[\log \alpha_i - \log(\alpha_i + S_i) + \frac{Q_i^2}{\alpha_i + S_i}\right] = L(\alpha_{-i}) + l(\alpha_i) \qquad (9)$$

where L(α_{-i}) is the logarithm of the marginal likelihood with the basis vector φ_i removed, which corresponds to α_i = ∞, and l(α_i) is the separate part of the log marginal likelihood that depends only on α_i. S_i is defined as the sparsity factor and Q_i as the quality factor. With respect to α_i, L(α) has a unique maximum at:

$$\alpha_i = \begin{cases} \dfrac{S_i^2}{Q_i^2 - S_i}, & Q_i^2 > S_i \\ \infty, & Q_i^2 \le S_i \end{cases} \qquad (10)$$

To maximize L(α), iterate according to formula (10) to find suitable weights; the hyperparameter α is updated continually along with the weights w. Through repeated updating, the final trained model is obtained. The weights corresponding to some sample points are zero, and the points with non-zero weights are the relevance vectors. In summary, the basic steps of the fast relevance vector machine classification algorithm are as follows:

(1) Initialize σ² = 0;

(2) Initialize α_i with a single basis vector φ_i, its value being obtained by rearranging formula (10), and set all other α_m (m ≠ i) to infinity;

(3) Compute the covariance matrix Σ and the weight matrix μ, and initialize S_m and Q_m for all M basis functions φ_m;

(4) Select a candidate basis vector φ_i from the set of all M basis functions φ_m;

(5) Compute θ_i = Q_i² − S_i;

(6) If θ_i > 0 and α_i < ∞, re-estimate α_i;

(7) If θ_i > 0 and α_i = ∞, add φ_i to the model and re-estimate α_i;

(8) If θ_i ≤ 0 and α_i < ∞, delete φ_i and set α_i = ∞;

(9) Recompute the covariance matrix Σ, the weight matrix μ, and the S_m and Q_m of the current iteration using the Laplace approximation;

(10) If the algorithm has converged or the maximum number of iterations has been reached, terminate; otherwise go to step (4). The termination condition is that, for every basis function in the model, the corresponding α_i satisfies α_i < 1e12, together with a further condition whose formula is not reproduced in this text.
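The decision made in steps (5) to (8) for a single candidate basis vector can be sketched as below, using formula (10) for the re-estimated α_i. This is not the full training loop: computing S_i and Q_i, and recomputing Σ and μ, are omitted. The function names and the "skip" outcome for a basis vector that is outside the model and stays outside are illustrative assumptions.

```python
import math


def optimal_alpha(s_i, q_i):
    """Formula (10): value of alpha_i that maximizes l(alpha_i)."""
    theta = q_i ** 2 - s_i
    return s_i ** 2 / theta if theta > 0 else math.inf


def choose_action(alpha_i, s_i, q_i):
    """Return (action, new_alpha_i) for one candidate basis vector."""
    theta = q_i ** 2 - s_i               # step (5)
    in_model = math.isfinite(alpha_i)    # alpha_i = inf means "not in the model"
    if theta > 0 and in_model:           # step (6)
        return "re-estimate", optimal_alpha(s_i, q_i)
    if theta > 0:                        # step (7)
        return "add", optimal_alpha(s_i, q_i)
    if in_model:                         # step (8)
        return "delete", math.inf
    return "skip", math.inf              # assumption: nothing to do in this case


# Illustrative values of alpha_i, S_i and Q_i.
kept = choose_action(2.0, 1.0, 2.0)
added = choose_action(math.inf, 1.0, 2.0)
removed = choose_action(2.0, 4.0, 1.0)
```

Each action changes only one α_i at a time, which is why the algorithm can grow the model from an empty set instead of starting with all N basis functions.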

After the basic fast relevance vector machine two-class models have been built, multiple two-class classifiers are combined by the one-versus-one method to form a multi-class classifier. If the samples to be classified belong to k classes, any two of the k classes can form one basic fast relevance vector machine two-class classifier. Classifying all training samples pairwise, the k classes yield k(k−1)/2 fast relevance vector machine two-class classifiers in total, each trained only on its own corresponding sample set. When an unknown sample is classified, voting is used: each test sample is passed through all k(k−1)/2 classifiers. For example, when a sample is classified between classes i and j, if the machine decides it belongs to class i, class i gains one vote; otherwise class j gains one vote. Once all classifiers have voted, the class with the most votes is taken as the class of the test sample.

Let the classification function f_ij(x) discriminate between samples of classes i and j. If f_ij(x) < 0, x is judged to belong to class i and class i receives one vote; otherwise x is judged to belong to class j and class j receives one vote. In the final decision, the vote counts are compared and the test sample is assigned to the class with the most votes.
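The voting scheme can be sketched as follows. The pairwise decision functions here are toy one-dimensional stand-ins rather than trained Fast RVM classifiers, and the names `one_vs_one_predict` and `pairwise` are illustrative. The text does not say how ties are broken; this sketch simply takes the first class to reach the top count.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise):
    """Pass x through every pairwise classifier and return (winner, votes)."""
    votes = Counter()
    for i, j in combinations(classes, 2):
        # Convention from the text: f_ij(x) < 0 votes for class i, else class j.
        winner = i if pairwise[(i, j)](x) < 0 else j
        votes[winner] += 1
    return votes.most_common(1)[0][0], votes


# Toy stand-ins: class c is centred at x = c, so f_ij is negative when x lies
# below the midpoint of i and j, that is, closer to class i.
classes = [1, 2, 3, 4]
pairwise = {
    (i, j): (lambda x, i=i, j=j: x - (i + j) / 2)
    for i, j in combinations(classes, 2)
}
label, votes = one_vs_one_predict(2.9, classes, pairwise)
```

With four classes this builds 4 × 3 / 2 = 6 pairwise classifiers, which matches the count used in the embodiment.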

Compared with the prior art, the present invention has the following advantages and beneficial effects:

1. The present invention establishes an online fault diagnosis model for Fast RVM sewage treatment based on clustering of imbalanced data. The majority-class data are compressed by the clustering-based fast relevance vector machine method and the minority-class data are expanded by the synthetic minority over-sampling method, reducing the imbalance of the sewage data. A multi-class model of the biochemical sewage treatment process is built with Fast RVM, which speeds up online updating. As operating-condition data are added, the model diagnoses in real time, is updated, and then waits for the next fault diagnosis, thereby forming an online fault diagnosis model. This online model improves the fault diagnosis accuracy for biochemical sewage treatment systems and performs well online.

2. The model of the present invention compresses the majority-class data with the clustering-based fast relevance vector machine and expands the minority-class data with the synthetic minority over-sampling method, reducing the imbalance of the sewage data. It achieves good results on balanced data and also reasonably good classification on imbalanced data. On this basis, a multi-class classifier is built with Fast RVM. Its key point is fast estimation of the hyperparameters of the training samples: the non-relevance vectors among the training samples are removed, which keeps the model sparse and reduces training time. The online fault diagnosis method for Fast RVM sewage treatment based on clustering of imbalanced data, used here to model the sewage treatment process online, therefore ensures the accuracy and real-time performance of online fault diagnosis.

3. In online simulation tests, each new group of data must be tested and then added to update the model. The historical data set keeps a fixed capacity by means of a limited memory, so the training data always consist of a fixed number of groups: whenever a group of the newest observations is added, the group of earliest observations is discarded. This ensures that the model always contains the information of the new data and prevents the information in the historical data from swamping the information carried by the new data.
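The limited-memory update described in item 3 can be sketched with a fixed-capacity queue. The sample tuples, the capacity of 4 and the function name `update_history` are illustrative assumptions.

```python
from collections import deque


def update_history(history, new_batch):
    """Append the newest samples; a deque with maxlen drops the oldest ones."""
    history.extend(new_batch)
    return history


# Toy history of (sample id, class label) pairs with a fixed capacity of 4.
history = deque([("s1", 1), ("s2", 1), ("s3", 2), ("s4", 1)], maxlen=4)
update_history(history, [("s5", 4), ("s6", 1)])
```

Adding k new samples to a full window pushes out exactly the k oldest ones, so the size of the training set, and with it the retraining cost, stays bounded.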

Description of the drawings

Fig. 1 is a flow chart of the Fast RVM sewage treatment online fault diagnosis method of the present invention based on clustering of imbalanced data.

Fig. 2 is a flow chart of the fast relevance vector machine classification algorithm of the present invention.

Fig. 3 is a schematic diagram of the one-versus-one fast relevance vector machine multi-class model of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to a specific embodiment.

As shown in Fig. 1, the Fast RVM sewage treatment online fault diagnosis method provided by the present invention is based on clustering of imbalanced data and proceeds as follows:

Steps S2, S3 and S4 are carried out as described above, with the following particulars for this embodiment.

Step S3 is specifically as follows:

S301. For each sample x in the minority class, compute its Euclidean distance to every sample in the minority-class sample set, obtain its k nearest neighbours, and record the indices of the neighbouring samples; here k is 5. New minority-class samples are then constructed by random linear interpolation:

p_j = x + rand(0,1) * (y_j − x), j = 1, 2, …, N (14)

where rand(0,1) denotes a random number in the interval (0,1).

In step S4, the one-versus-one fast relevance vector machine multi-class training model, shown in Fig. 3, is built as follows:

t_n = y(z_n; w) + ε_n (15)

where K(z, z_i) is the kernel function, w_i is the weight of the corresponding basis function, w = [w₀, w₁, …, w_N]^T, and ε_n is noise with ε_n ~ N(0, σ²), so that t_n ~ N(y(z_n, w), σ²). The targets t_n are assumed to be mutually independent, and the posterior distribution of the weights is:

p(w, α, σ² | t) = p(w | t, α, σ²) p(α, σ² | t) (18)

To maximize L(α), iterate according to formula (20), the counterpart of formula (10) above, to find suitable weights; the hyperparameter α is updated continually along with the weights w. Through repeated updating, the final trained model is obtained. The weights corresponding to some sample points are zero, and the points with non-zero weights are the relevance vectors. As shown in Fig. 2, the basic steps of the fast relevance vector machine classification algorithm are steps (1) to (10) listed above; in step (2), the initial value of α_i is obtained by rearranging formula (20).

The above Fast RVM sewage treatment online fault diagnosis method of the present invention is described below using concrete data:

The data for the experimental simulation come from the UCI database and are the daily monitoring data of a sewage treatment plant over two years. The whole data set has 527 records, including incomplete ones. Each sample has dimension 38 (that is, 38 measured variables, each corresponding to an index value), and 380 records have all attribute values complete. The monitored water body has 13 states in total, each denoted by a number for convenience in place of the state name. The distribution of the 527 records over the 13 states is shown in Table 1.

Table 1. Distribution of the 527 records over the 13 states

Class	1	2	3	4	5	6	7	8	9	10	11	12	13
Number of records	279	1	1	4	116	3	1	1	65	1	53	1	1

To reduce the complexity of classification, the samples are grouped into 4 major classes according to the nature of the sample classes, as shown in Table 2.

Table 2. Distribution of the 527 records over the 4 states

Class	1	2	3	4
Number of records	332	116	65	14

Class 1 is the normal condition, class 2 is the normal condition with performance above the mean, class 3 is the normal condition with low influent flow, and class 4 is the fault condition, covering secondary settler faults, abnormal conditions caused by storms, and solids overload.
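The counts in Table 2 can be reproduced from Table 1 under one particular grouping. The grouping below is inferred from the arithmetic alone and is an assumption, since the text does not state which of the 13 states go into which major class; the variable names are illustrative.

```python
# Record counts per original state, as listed in Table 1.
table1 = {1: 279, 2: 1, 3: 1, 4: 4, 5: 116, 6: 3, 7: 1,
          8: 1, 9: 65, 10: 1, 11: 53, 12: 1, 13: 1}

# Assumed grouping (inferred from the counts, not stated in the text):
# states 1 and 11 -> class 1, state 5 -> class 2, state 9 -> class 3,
# all remaining rare states -> class 4 (fault).
grouping = {
    1: [1, 11],
    2: [5],
    3: [9],
    4: [2, 3, 4, 6, 7, 8, 10, 12, 13],
}

table2 = {major: sum(table1[state] for state in states)
          for major, states in grouping.items()}

# Ratio between the largest and the smallest major class.
imbalance_ratio = max(table2.values()) / min(table2.values())
```

Under this grouping the largest class is roughly 24 times the size of the fault class, which is the imbalance that the compression and over-sampling steps are meant to reduce.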

The online fault diagnosis method of this embodiment for Fast RVM sewage treatment based on clustering of imbalanced data comprises the following steps in order:

S1. First remove the 147 records with incomplete attributes from the 527 sewage records, leaving 380 records with complete attributes. Then normalize the data into the [0, 1] interval, and split the processed data set in a 2:1 ratio by stratified random sampling with optimum allocation to obtain the historical data set x_old and the online update test set x_new.

S2. Extract the majority-class samples (class 1) from the historical data set and cluster them into two clusters with the K-means method. Then model the clustered class-1 data with the fast relevance vector machine method to obtain a suitable number of relevance vectors, and replace the majority-class samples with the selected relevance vectors;

S3. According to the over-sampling rate, expand the minority-class samples (classes 3 and 4) in the historical samples with the synthetic minority over-sampling method;

S4. Recombine the processed historical sample data of all classes into a new historical training set, as shown in Table 3, and build the one-versus-one fast relevance vector machine multi-class training model. The model uses the RBF kernel function, with the kernel width parameter determined by a grid search with 5-fold cross-validation on the new training set. Since there are four classes in total, 6 two-class classifiers are built;

S5. Take k new samples from the online update test set x_new and test them in the multi-class model: feed them to each of the 6 classifiers, vote, and save the classification results. Add the new samples to the historical data set and remove the first k samples from it;

S6. Return to step S2 and retrain the model. Repeat this process until the online update data have all been tested, giving the final online test result and thereby identifying the online operating state of the sewage treatment process. The clustering-based Fast RVM sewage treatment online fault diagnosis model adopted by the present invention meets the requirements well, enabling real-time monitoring and control of the operating state of the sewage treatment process, and is worth popularizing.
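The 2:1 stratified split in step S1 of the embodiment can be sketched as follows. The function name `stratified_split`, the rounding rule for the per-class cut and the toy label list are assumptions; the per-class counts of the 380 complete records are not given in the text, so they are not used here.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratio=(2, 1), seed=0):
    """Split sample indices class by class so each class keeps the same ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    first, second = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Assumed rule: round the share of the first part to the nearest integer.
        cut = round(len(indices) * ratio[0] / sum(ratio))
        first.extend(indices[:cut])
        second.extend(indices[cut:])
    return sorted(first), sorted(second)


# Toy labels: nine samples of class 1 and three samples of class 4.
labels = [1] * 9 + [4] * 3
x_old_idx, x_new_idx = stratified_split(labels)
```

Splitting within each class keeps the rare fault class represented in both the historical set and the online update set, which a plain random split would not guarantee.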


The embodiment described above is only a preferred embodiment of the present invention and does not limit the scope of implementation of the present invention; any change made according to the shape and principle of the present invention shall fall within the scope of protection of the present invention.

Claims

1. An online fault diagnosis method for Fast RVM sewage treatment, characterized by comprising the following steps:

S1. Remove the samples with incomplete attributes from the sewage data. Because the input variables have different dimensions (units), normalize the data into the [0, 1] interval, and determine the historical data set x_old and the update test set x_new;

S2. Compress the majority-class samples in the historical data using the clustering-based fast relevance vector machine method;

S3. Expand the minority-class samples in the historical data using the synthetic minority over-sampling method;

S4. Recombine the sample data of all classes in the processed historical data into a new historical training set, and build a one-versus-one ("one-to-one") fast relevance vector machine multi-class training model;

S5. Take k new samples from the update test set x_new and test them in the model, save the classification results, add the new samples to the historical data set, and remove the first k samples from the historical data set;

S6. Return to step S2, process the imbalanced historical data again and retrain the model. Repeat this process until the online update data have all been tested, giving the final online test result and thereby identifying the online operating state of the sewage treatment process.

2. The online fault diagnosis method for Fast RVM sewage treatment according to claim 1, characterized in that step S2 is specifically as follows:

S201. Suppose the majority-class sample set X = {x₁, x₂, …, x_i, …, x_n} consists of n data points in the space R^d, where d is the dimension of the sample attributes. Randomly select k objects from the n data objects as the initial cluster centres;

S202. Assign each remaining sample object to the nearest cluster centre according to its distance from each cluster centre. Let c_j be the centre of the j-th class; the distance between x_i and c_j is:

$$d(x_i, c_j) = \sqrt{(x_{i1} - c_{j1})^2 + \dots + (x_{im} - c_{jm})^2 + \dots + (x_{id} - c_{jd})^2} \qquad (1)$$

S203. Update the cluster centre of each class from the points it contains; the m-th attribute c_j^m of the centre c_j of the j-th class, which contains n_j samples, is computed as:

$$c_j^m = \frac{x_{j1}^m + x_{j2}^m + \dots + x_{jn_j}^m}{n_j} \qquad (2)$$

S204. Repeat steps S202 and S203 until the criterion function converges. The mean square deviation is used as the criterion function:

$$J = \sqrt{\frac{\sum_{j=1}^{k} \sum_{q=1}^{n_j} \left(d(x_{jq}, c_j)\right)^2}{n - 1}} \qquad (3)$$

S205. Build a fast relevance vector machine classification model on the clustered majority-class samples to obtain a set number of relevance vectors. These relevance vectors are far fewer than the original majority-class data yet remain representative. The selected relevance vectors then replace the original majority-class samples, which compresses the majority class.

3. The online fault diagnosis method for Fast RVM sewage treatment according to claim 1, characterized in that step S3 is specifically as follows:

S301. For each sample x in the minority class, compute its Euclidean distance to every sample in the minority-class sample set, obtain its k nearest neighbours, and record the indices of the neighbouring samples;

S302. According to the over-sampling rate N, for each minority-class sample x, randomly select N samples from its k nearest neighbours, denoted y₁, y₂, …, y_N;

S303. Perform random linear interpolation between the original sample x and each y_j (j = 1, 2, …, N) to construct the new minority-class samples p_j:

p_j = x + rand(0,1) * (y_j − x), j = 1, 2, …, N (4)

where rand(0,1) denotes a random number in the interval (0,1).

4. The online fault diagnosis method for Fast RVM sewage treatment according to claim 1, characterized in that, in step S4, the one-versus-one fast relevance vector machine multi-class training model is built as follows:

The processed historical data are defined as {z_n, t_n}, n = 1, …, N, with z_n ∈ R^d and t_n ∈ R, where N is the number of samples in the data set, n is the sample index, d is the dimension of the sample attributes, z_n is the input of a sample and t_n is its target value. The prediction function is given by formula (5):

t_n = y(z_n; w) + ε_n (5)

where y(z; w) is defined as follows:

$$y(z; w) = \sum_{i=1}^{N} w_i K(z, z_i) + w_0 \qquad (6)$$

Here K(z, z_i) is the kernel function, w_i is the weight of the corresponding basis function, w = [w₀, w₁, …, w_N]^T, and ε_n is noise with ε_n ~ N(0, σ²), so that t_n ~ N(y(z_n, w), σ²). Assuming the targets t_n are mutually independent:

$$p(t \mid \sigma^2, w) = \prod_{i=1}^{N} \mathcal{N}\left(t_i \mid y(z_i, w), \sigma^2\right) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2}\right) \qquad (7)$$

where Φ is the N × (N+1) design matrix. To avoid over-fitting, the weights w of the model must be constrained; they are assumed to follow a Gaussian prior distribution with hyperparameter α. When a new set of input variables is given, the distribution p(t* | t) of the corresponding target value t* is determined by the posterior p(w, α, σ² | t). From the prior distribution and the likelihood, the posterior distribution of the weights is obtained:

p(w, α, σ² | t) = p(w | t, α, σ²) p(α, σ² | t) (8)

Approximating the above, the problem finally becomes one of maximizing p(α, σ² | t) ∝ p(t | α, σ²) p(α) p(σ²), that is, finding the most probable values α_MP and σ²_MP of the parameters α and σ².

During training, the fast relevance vector machine starts from an empty set and dynamically adds columns to the basis matrix Φ to increase the marginal likelihood function, or removes redundant columns of Φ to increase the objective function. Taking the logarithm of the marginal likelihood p(t | α, σ²) and writing L(α) = log[p(t | α, σ²)], rearrangement gives:

$$L(\alpha) = L(\alpha_{-i}) + \frac{1}{2}\left[\log \alpha_i - \log(\alpha_i + S_i) + \frac{Q_i^2}{\alpha_i + S_i}\right] = L(\alpha_{-i}) + l(\alpha_i) \qquad (9)$$

where L(α_{-i}) is the logarithm of the marginal likelihood with the basis vector φ_i removed, which corresponds to α_i = ∞, and l(α_i) is the separate part of the log marginal likelihood that depends only on α_i. S_i is defined as the sparsity factor and Q_i as the quality factor. With respect to α_i, L(α) has a unique maximum at:

$$\alpha_i = \begin{cases} \dfrac{S_i^2}{Q_i^2 - S_i}, & Q_i^2 > S_i \\ \infty, & Q_i^2 \le S_i \end{cases} \qquad (10)$$

To maximize L(α), iterate according to formula (10) to find suitable weights; the hyperparameter α is updated continually along with the weights w. Through repeated updating, the final trained model is obtained. The weights corresponding to some sample points are zero, and the points with non-zero weights are the relevance vectors. The basic steps of the fast relevance vector machine classification algorithm are as follows:

(1) Initialize σ² = 0;

(2) Initialize α_i with a single basis vector φ_i, its value being obtained by rearranging formula (10), and set all other α_m (m ≠ i) to infinity;

(3) Compute the covariance matrix Σ and the weight matrix μ, and initialize S_m and Q_m for all M basis functions φ_m;

(4) Select a candidate basis vector φ_i from the set of all M basis functions φ_m;

(5) Compute θ_i = Q_i² − S_i;

(6) If θ_i > 0 and α_i < ∞, re-estimate α_i;

(7) If θ_i > 0 and α_i = ∞, add φ_i to the model and re-estimate α_i;

(8) If θ_i ≤ 0 and α_i < ∞, delete φ_i and set α_i = ∞;

(9) Recompute the covariance matrix Σ, the weight matrix μ, and the S_m and Q_m of the current iteration using the Laplace approximation;

(10) If the algorithm has converged or the maximum number of iterations has been reached, terminate; otherwise go to step (4). The termination condition is that, for every basis function in the model, the corresponding α_i satisfies α_i < 1e12, together with a further condition whose formula is not reproduced in this text.

After the basic fast relevance vector machine two-class models have been built, multiple two-class classifiers are combined by the one-versus-one method to form a multi-class classifier.