CN104680015A

CN104680015A - Online soft measurement method for sewage treatment based on quick relevance vector machine

Info

Publication number: CN104680015A
Application number: CN201510093369.6A
Authority: CN
Inventors: 许玉格; 曹涛; 罗飞; 宋亚龄; 张雍涛
Original assignee: South China University of Technology SCUT
Current assignee: South China University of Technology SCUT
Priority date: 2015-03-02
Filing date: 2015-03-02
Publication date: 2015-06-03

Abstract

The invention discloses an online soft-measurement method for sewage treatment based on a fast relevance vector machine. The method comprises the following steps in order: estimating the hyperparameters with a fast marginal likelihood algorithm to obtain the weights and the sample bias of the model; then establishing an online prediction model based on the fast relevance vector machine and optimizing the model parameters, thereby achieving accurate and fast measurement of biochemical oxygen demand (BOD) in sewage. The method meets real-time requirements and, by establishing an optimal prediction model, improves prediction accuracy. The online soft-measurement model of sewage quality established with the fast relevance vector machine has high prediction accuracy, strong generalization ability and a short update time, which is significant for reducing the operating cost of a sewage treatment plant, reflecting sewage quality in real time and realizing automatic control of sewage treatment.

Description

An online soft-measurement method for wastewater treatment based on a fast relevance vector machine

Technical field

The present invention relates to the field of sewage treatment, and in particular to an online soft-measurement method for wastewater treatment based on a fast relevance vector machine.

Background technology

Wastewater treatment is an indispensable part of economic development and water resources protection. With the rapid growth of the national economy, the volume of wastewater discharged has also increased greatly, while sewage treatment plants are too few and treatment cycles too long, falling far short of environmental protection requirements. At the same time, investment in environmental protection is increasing, and sewage treatment technology is receiving more and more attention. The national development plan explicitly calls for the research, development and promotion of low-energy, efficient sewage treatment technology.

In sewage discharge standards, the parameters used to judge compliance include chemical oxygen demand (COD), biochemical oxygen demand (BOD), ammonia nitrogen, phosphorus and suspended solids. Of these, BOD and COD reflect the degree to which water is polluted by organic matter, and the BOD/COD ratio reflects the biodegradability of the sewage. Measuring these two parameters is therefore of great value for controlling wastewater treatment. COD is the amount of oxidant consumed in oxidizing the reducing substances in 1 liter of water sample under specified conditions, converted to the number of milligrams of oxygen required per liter of sample once fully oxidized, and is expressed in mg/L. BOD is the amount of dissolved oxygen consumed by microorganisms in decomposing and oxidizing organic matter under specified temperature and time conditions, and is expressed in mg/L.

At present, wastewater treatment plants generally measure the BOD and COD concentrations in sewage by the dilution method or with sensors. Because the analysis cycle for these two indicators is long and the measurements are often subject to error, the on-site condition of the treatment process cannot be reflected in time, so the effluent control system suffers from a large time delay and cannot perform at its best. The present invention proposes a new soft-measurement method for BOD: the training process of the relevance vector machine is improved with a fast marginal likelihood algorithm so that the hyperparameters reach stable values more quickly, giving the weights and the bias; an online fast relevance vector machine soft-measurement model is then built to measure the effluent BOD of the wastewater treatment process.

Summary of the invention

The object of the invention is to overcome the shortcomings and deficiencies of the prior art by providing an online soft-measurement method for wastewater treatment based on a fast relevance vector machine.

The object of the invention is achieved by the following technical scheme:

An online soft-measurement method for wastewater treatment based on a fast relevance vector machine, comprising the following steps in order:

A. Estimate the hyperparameters with a fast marginal likelihood algorithm to obtain the weights and the sample bias of the model;

B. Establish the fast relevance vector machine online prediction model and optimize the model parameters, thereby achieving accurate and fast measurement of BOD in sewage.

The online soft-measurement method for wastewater treatment based on a fast relevance vector machine specifically comprises the following steps:

S1. Reject the abnormal points in the input and output data; because the input variables have different dimensions (units), normalize them to the interval [0, 1];

S2. Given a sewage data set $\{(x_n, t_n),\ n = 1, 2, \ldots, N\}$ with $x_n \in \mathbb{R}^d$ and $t_n \in \mathbb{R}$, where $N$ is the number of samples. For simplicity, only a scalar target function is considered, and the standard probabilistic formulation is followed, assuming:

$$ t_n = y(x_n; w) + \varepsilon_n \tag{1} $$

where $y(\cdot)$ is a nonlinear function and $\varepsilon_n$ is Gaussian noise with mean 0 and variance $\sigma^2$, i.e. $\varepsilon_n \sim \mathcal{N}(0, \sigma^2)$, so that $t_n \sim \mathcal{N}(y(x_n), \sigma^2)$; that is, $t_n$ follows a Gaussian distribution with mean $y(x_n)$ and variance $\sigma^2$. As in the support vector machine, the function $y(x)$ is defined as

$$ y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \tag{2} $$

where the basis functions are given by $\phi_i(x) = K(x, x_i)$, i.e. the kernel is parameterized by the training vectors. Assuming the $t_n$ are mutually independent, the likelihood function of the whole training set can be written as

$$ p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\|t - \Phi w\|^2}{2\sigma^2} \right) \tag{3} $$

where $t = [t_1, t_2, \ldots, t_N]^T$, $w = [w_0, w_1, \ldots, w_N]^T$, and $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^T$ is the $N \times (N+1)$ design matrix formed from the nonlinear basis functions, with $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]^T$;

Because the model has about as many parameters as training samples, the maximum-likelihood estimates of $w$ and $\sigma^2$ obtained from formula (3) are likely to over-fit. A common way to avoid over-fitting is to impose constraints on the parameters; here, within the Bayesian probabilistic framework, $w$ and $\sigma^2$ are constrained by defining prior probability distributions;

To favor smoother functions, the prior distribution of $w$ is defined as a zero-mean Gaussian:

$$ p(w \mid \alpha) = \prod_{j=0}^{N} \mathcal{N}(w_j \mid 0, \alpha_j^{-1}) \tag{4} $$

In formula (4), the hyperparameter vector is $\alpha = [\alpha_0, \alpha_1, \ldots, \alpha_N]^T$; more importantly, each hyperparameter $\alpha_j$ is associated only with its own weight $w_j$. Under this constraint, after learning from a large amount of sewage data most hyperparameters tend to infinity and the corresponding weights become 0, which makes the RVM highly sparse;

With the prior defined, Bayes' rule gives the posterior over the unknowns given the data:

$$ p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)} \tag{5} $$

For a given test point $x_*$, the predictive distribution of the corresponding effluent quality prediction $t_*$ is

$$ p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2 \tag{6} $$

According to Bayes' formula, the posterior distribution of $w$ follows from the sample likelihood (3) and the prior distribution of $w$ (4):

$$ p(w \mid t, \alpha, \beta) = \frac{p(w \mid \alpha)\, p(t \mid w, \beta)}{p(t \mid \alpha, \beta)} \tag{7} $$

The posterior is decomposed as

$$ p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \tag{8} $$

so the posterior distribution of the weights is

$$ p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (w - \mu)^T \Sigma^{-1} (w - \mu) \right\} \tag{9} $$

with covariance

$$ \Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1} \tag{10} $$

and mean

$$ \mu = \sigma^{-2} \Sigma \Phi^T t \tag{11} $$

where the matrix $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$.

The hyperparameters and the noise variance are re-estimated by

$$ \alpha_i^{new} = \frac{\gamma_i}{\mu_i^2} \tag{12} $$

$$ (\sigma^2)^{new} = \frac{\|t - \Phi\mu\|^2}{N - \sum_i \gamma_i} \tag{13} $$

where $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$ and $\Sigma_{ii}$ is the i-th diagonal element of the covariance matrix $\Sigma$. Iterating formulas (10) to (13) finally yields the estimates of the hyperparameters $\alpha$ and the variance $\sigma^2$. The predicted sewage quality output is $y_* = \mu^T \phi(x_*)$, where $x_*$ is the input of the sewage treatment process;

S3. Estimate the hyperparameters with the fast marginal likelihood algorithm:

To address the high computational time complexity and large memory cost of the relevance vector machine, a fast marginal likelihood algorithm is adopted. Starting from an empty set, it dynamically adds columns to the basis matrix $\Phi$ during training so as to increase the marginal likelihood, or removes redundant columns of $\Phi$ to increase the objective function;

S4. Predict the sewage samples to be predicted: the influent data are fed to the trained relevance vector machine soft-measurement model, and the model output is the predicted effluent BOD.
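The prediction in step S4 can be sketched as follows. This is a minimal illustration of formula (2) restricted to the retained relevance vectors, not the inventors' implementation; the Gaussian kernel, its `width` parameter and the function name `predict_bod` are assumptions, since the text does not name a kernel.

```python
import numpy as np

def predict_bod(x_new, relevance_vectors, weights, bias, width=1.0):
    """y_* = w_0 + sum_i w_i K(x_*, x_i) over the retained relevance vectors.

    The Gaussian kernel K(a, b) = exp(-||a - b||^2 / width^2) is an assumed
    choice; any kernel used during training would be substituted here.
    """
    sq_dist = ((relevance_vectors - x_new) ** 2).sum(axis=1)
    return bias + weights @ np.exp(-sq_dist / width ** 2)
```

Only the training samples whose weights are nonzero need to be stored for prediction, which is where the sparsity of the RVM pays off at run time.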

Step S3 specifically comprises the following steps in order:

The relevance vector machine determines the hyperparameters $\alpha$ and the variance $\sigma^2$ by maximizing the marginal likelihood $p(t \mid \alpha, \sigma^2)$, which is equivalent to maximizing its logarithm. Writing $L(\alpha) = \log p(t \mid \alpha, \sigma^2)$, we have

$$ \begin{aligned} L(\alpha) &= \log p(t \mid \alpha, \sigma^2) = \log \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw \\ &= -\frac{1}{2}\left[ N \log 2\pi + \log |C| + t^T C^{-1} t \right] \end{aligned} \tag{14} $$

where $C = \sigma^2 I + \Phi A^{-1} \Phi^T$ and $t = [t_1, t_2, \ldots, t_N]^T$;

To make $L(\alpha)$ easier to maximize, the matrix $C$ is rewritten equivalently as

$$ \begin{aligned} C &= \sigma^2 I + \Phi A^{-1} \Phi^T = \sigma^2 I + \sum_{m \neq i} \alpha_m^{-1} \phi_m \phi_m^T + \alpha_i^{-1} \phi_i \phi_i^T \\ &= C_{-i} + \alpha_i^{-1} \phi_i \phi_i^T \end{aligned} \tag{15} $$

where $C_{-i}$ is the covariance matrix of the samples after the basis vector $\phi_i$ has been removed, i.e. when $\alpha_i = \infty$. From the standard matrix determinant and inversion identities,

$$ \begin{aligned} |C| &= |C_{-i}|\, \left| 1 + \alpha_i^{-1} \phi_i^T C_{-i}^{-1} \phi_i \right| \\ C^{-1} &= C_{-i}^{-1} - \frac{C_{-i}^{-1} \phi_i \phi_i^T C_{-i}^{-1}}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \end{aligned} \tag{16} $$

Formula (14) can therefore be rewritten as

$$ \begin{aligned} L(\alpha) &= -\frac{1}{2}\left[ N \log 2\pi + \log |C_{-i}| + t^T C_{-i}^{-1} t - \log \alpha_i + \log(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i) - \frac{(\phi_i^T C_{-i}^{-1} t)^2}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \right] \\ &= L(\alpha_{-i}) + \frac{1}{2}\left[ \log \alpha_i - \log(\alpha_i + S_i) + \frac{Q_i^2}{\alpha_i + S_i} \right] \\ &= L(\alpha_{-i}) + l(\alpha_i) \end{aligned} \tag{17} $$

Here $L(\alpha_{-i})$ is the logarithm of the marginal likelihood after the basis vector $\phi_i$ has been removed (i.e. when $\alpha_i = \infty$), $l(\alpha_i)$ is the part of the log marginal likelihood that depends only on $\alpha_i$, and $S_i$ and $Q_i$ are as defined in formula (19). Taking the partial derivative with respect to $\alpha_i$ gives

$$ \frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\partial l(\alpha_i)}{\partial \alpha_i} = \frac{1}{2}\left[ \frac{1}{\alpha_i} - \frac{1}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} - \frac{(\phi_i^T C_{-i}^{-1} t)^2}{(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i)^2} \right] \tag{18} $$

Let

$$ S_i = \phi_i^T C_{-i}^{-1} \phi_i, \qquad Q_i = \phi_i^T C_{-i}^{-1} t \tag{19} $$

so that formula (18) can be rewritten as

$$ \frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{2(\alpha_i + S_i)^2} \tag{20} $$
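The compact form of the derivative can be checked by differentiating $l(\alpha_i)$ from formula (17) term by term and putting the result over a common denominator. The lines below are a worked restatement using only $S_i$ and $Q_i$ from formula (19); they add no material beyond the derivation itself.

```latex
\begin{aligned}
\frac{\partial l(\alpha_i)}{\partial \alpha_i}
  &= \frac{1}{2}\left[\frac{1}{\alpha_i} - \frac{1}{\alpha_i + S_i}
     - \frac{Q_i^2}{(\alpha_i + S_i)^2}\right] \\
  &= \frac{1}{2}\cdot
     \frac{\alpha_i^{-1}(\alpha_i + S_i)^2 - (\alpha_i + S_i) - Q_i^2}
          {(\alpha_i + S_i)^2} \\
  &= \frac{1}{2}\cdot
     \frac{(\alpha_i + 2 S_i + \alpha_i^{-1} S_i^2) - \alpha_i - S_i - Q_i^2}
          {(\alpha_i + S_i)^2} \\
  &= \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{2(\alpha_i + S_i)^2}
\end{aligned}
```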

Setting formula (20) to zero, and noting that $\alpha_i$ is an inverse variance and must therefore be positive, when $Q_i^2 > S_i$ we have

$$ \alpha_i = \frac{S_i^2}{Q_i^2 - S_i} \tag{21} $$

Taking the second partial derivative of $L(\alpha)$ with respect to $\alpha_i$ gives

$$ \begin{aligned} \frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} &= \frac{-\alpha_i^{-2} S_i^2 (\alpha_i + S_i)^2 - 2(\alpha_i + S_i)\left[\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)\right]}{2(\alpha_i + S_i)^4} \\ &= -\frac{\alpha_i^{-2} S_i^2}{2(\alpha_i + S_i)^2} - \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{(\alpha_i + S_i)^3} \end{aligned} \tag{22} $$

Substituting the stationary point (21), at which the numerator of formula (20) vanishes, into formula (22) gives

$$ \left. \frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} \right|_{\alpha_i = \frac{S_i^2}{Q_i^2 - S_i}} = \frac{-S_i^2}{2\alpha_i^2 (\alpha_i + S_i)^2} \tag{23} $$

So when $Q_i^2 > S_i$, the left-hand side of formula (23) is always negative, and it follows from the above derivation that $L(\alpha)$ has a unique maximum with respect to $\alpha_i$ at

$$ \alpha_i = \begin{cases} \dfrac{S_i^2}{Q_i^2 - S_i}, & Q_i^2 > S_i \\ \infty, & Q_i^2 \leq S_i \end{cases} \tag{24} $$

This gives the maximum of $L(\alpha)$.
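The quantities in formulas (14), (19) and (24) can be evaluated directly for small problems. The sketch below does this by brute force (building $C_{-i}$ explicitly), which is useful for checking the closed-form optimum but is not the efficient update an actual fast implementation would use; the function names are illustrative assumptions.

```python
import numpy as np

def log_marginal(Phi, t, alpha, sigma2):
    """L(alpha) of formula (14); columns with alpha = inf contribute nothing."""
    N = len(t)
    active = np.isfinite(alpha)
    # C = sigma^2 I + sum_m alpha_m^{-1} phi_m phi_m^T over the active columns
    C = sigma2 * np.eye(N) + (Phi[:, active] / alpha[active]) @ Phi[:, active].T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (N * np.log(2 * np.pi) + logdet + t @ np.linalg.solve(C, t))

def sparsity_quality(Phi, t, alpha, sigma2, i):
    """S_i and Q_i of formula (19), computed with basis vector i removed from C."""
    a = alpha.copy()
    a[i] = np.inf
    active = np.isfinite(a)
    N = len(t)
    C_minus = sigma2 * np.eye(N) + (Phi[:, active] / a[active]) @ Phi[:, active].T
    phi = Phi[:, i]
    S = phi @ np.linalg.solve(C_minus, phi)
    Q = phi @ np.linalg.solve(C_minus, t)
    return S, Q

def optimal_alpha(S, Q):
    """Formula (24): finite optimum if Q^2 > S, otherwise infinity."""
    return S ** 2 / (Q ** 2 - S) if Q ** 2 > S else np.inf
```

Evaluating `log_marginal` at the value returned by `optimal_alpha` and at nearby values of $\alpha_i$ is a quick numerical check that formula (24) is indeed the maximizer.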

The marginal likelihood $L(\alpha)$ is maximized as follows:

A. If the basis vector $\phi_i$ is in the model, i.e. $\alpha_i < \infty$, but $Q_i^2 \leq S_i$, delete $\phi_i$ from the model, i.e. set $\alpha_i = \infty$; this increases $L(\alpha)$;

B. If the basis vector $\phi_i$ is not in the model, i.e. $\alpha_i = \infty$, but $Q_i^2 > S_i$, add $\phi_i$ to the model and update $\alpha_i$ with formula (24); this increases $L(\alpha)$;

C. If the basis vector $\phi_i$ is in the model, i.e. $\alpha_i < \infty$, and $Q_i^2 > S_i$, update $\alpha_i$ with formula (24); this increases $L(\alpha)$.

The basic regression algorithm of the fast relevance vector machine comprises the following steps:

I. Initialize $\sigma^2$;

II. Initialize $\alpha_i$ with a single basis vector $\phi_i$; from formula (24), $\alpha_i = \dfrac{\|\phi_i\|^2}{\|\phi_i^T t\|^2 / \|\phi_i\|^2 - \sigma^2}$, and set all other $\alpha_m$ ($m \neq i$) to infinity;

III. Compute $\Sigma$ and $\mu$, and initialize $S_m$ and $Q_m$ for all $M$ basis functions $\phi_m$;

IV. Select a candidate basis vector $\phi_i$ from the set of all $M$ basis functions $\phi_m$;

V. Compute $\theta_i \equiv Q_i^2 - S_i$;

VI. If $\theta_i > 0$ and $\alpha_i < \infty$ (the basis vector $\phi_i$ is in the model), re-estimate $\alpha_i$;

VII. If $\theta_i > 0$ and $\alpha_i = \infty$ (the basis vector $\phi_i$ is not in the model), add $\phi_i$ to the model and re-estimate $\alpha_i$;

VIII. If $\theta_i \leq 0$ and $\alpha_i < \infty$, delete $\phi_i$ from the model and set $\alpha_i = \infty$;

IX. Estimate the noise variance $\sigma^2 = \dfrac{\|t - \Phi\mu\|^2}{N - M + \sum_m \alpha_m \Sigma_{mm}}$, where $N$ is the number of data points and $M$ is the number of basis functions;

X. Recompute the covariance matrix $\Sigma$, the weight vector $\mu$ and the corresponding $S_m$ and $Q_m$ of the iteration;

XI. If the algorithm has converged or the maximum number of iterations has been reached, save the weights and the bias and end training; otherwise go to step IV and continue training.
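Steps I to XI can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the inventors' code: the initial $\sigma^2$ (10% of the target variance), the greedy choice of the candidate with the largest gain in $L(\alpha)$ in step IV, the convergence tolerance, and the noise-variance expression in step IX (the usual form, consistent with formula (13)) are all assumptions. The conversion of $S_m$, $Q_m$ for vectors already in the model follows from formula (16).

```python
import numpy as np

def _l(a, S, Q):
    """l(alpha_i) from formula (17); it tends to 0 as alpha_i -> infinity."""
    out = np.zeros_like(S)
    f = np.isfinite(a)
    out[f] = 0.5 * (np.log(a[f]) - np.log(a[f] + S[f]) + Q[f] ** 2 / (a[f] + S[f]))
    return out

def _statistics(Phi, t, alpha, sigma2):
    """Sigma, mu (formulas (10), (11)) and S_m, Q_m (formula (19)) for all m."""
    act = np.flatnonzero(np.isfinite(alpha))
    Pa = Phi[:, act]
    Sigma = np.linalg.inv(Pa.T @ Pa / sigma2 + np.diag(alpha[act]))
    mu = Sigma @ Pa.T @ t / sigma2
    # phi_m^T C^{-1} phi_m and phi_m^T C^{-1} t via the Woodbury identity.
    G = Phi.T @ Pa / sigma2
    S = (Phi ** 2).sum(axis=0) / sigma2 - np.einsum("ij,jk,ik->i", G, Sigma, G)
    Q = Phi.T @ t / sigma2 - G @ mu
    # For vectors already in the model, remove their own term (C -> C_{-m}).
    d = alpha[act] - S[act]
    S[act], Q[act] = alpha[act] * S[act] / d, alpha[act] * Q[act] / d
    return act, Sigma, mu, S, Q

def fast_rvm(Phi, t, max_iter=200, tol=1e-6):
    N, M = Phi.shape
    sigma2 = 0.1 * np.var(t)                        # step I (assumed heuristic)
    norms = (Phi ** 2).sum(axis=0)
    proj = (Phi.T @ t) ** 2 / norms
    i0 = int(np.argmax(proj))                       # step II: best-aligned vector
    alpha = np.full(M, np.inf)
    alpha[i0] = norms[i0] / (proj[i0] - sigma2) if proj[i0] > sigma2 else 1.0
    act, Sigma, mu, S, Q = _statistics(Phi, t, alpha, sigma2)   # step III
    for _ in range(max_iter):
        theta = Q ** 2 - S                          # step V
        new_alpha = np.full(M, np.inf)
        pos = theta > 0
        new_alpha[pos] = S[pos] ** 2 / theta[pos]   # formula (24)
        gain = _l(new_alpha, S, Q) - _l(alpha, S, Q)
        if act.size == 1 and not np.isfinite(new_alpha[act[0]]):
            gain[act[0]] = -np.inf                  # never empty the model
        i = int(np.argmax(gain))                    # step IV (assumed greedy rule)
        if gain[i] < tol:                           # step XI: converged
            break
        alpha[i] = new_alpha[i]                     # steps VI, VII or VIII
        act, Sigma, mu, S, Q = _statistics(Phi, t, alpha, sigma2)
        resid = t - Phi[:, act] @ mu
        sigma2 = resid @ resid / (N - act.size + alpha[act] @ np.diag(Sigma))  # step IX
        act, Sigma, mu, S, Q = _statistics(Phi, t, alpha, sigma2)  # step X
    w = np.zeros(M)
    w[act] = mu
    return w, alpha, sigma2
```

A call such as `w, alpha, sigma2 = fast_rvm(Phi, t)` returns a weight vector with zeros for pruned basis vectors; entries of `alpha` equal to infinity mark the vectors left out of the model.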

Compared with the prior art, the present invention has the following advantages and beneficial effects:

1. The present invention adopts an online soft-measurement model of effluent quality based on a fast relevance vector machine. An offline fast relevance vector machine model is first established and then updated in real time according to operating conditions, retaining the important information in the model; the fast marginal likelihood algorithm is used to speed up learning so that real-time requirements are met, and an optimal prediction model is established so that prediction accuracy is improved. The online soft-measurement model of sewage quality established with the fast relevance vector machine has high prediction accuracy, strong generalization ability and a short update time, which is of great significance for reducing the operating cost of a sewage treatment plant, reflecting sewage quality in real time and supporting the automatic control system of the treatment process.

2. The basic idea of soft measurement is to combine automatic control theory with knowledge of the production process and, using computer technology, to infer and estimate important variables that are difficult or temporarily impossible to measure (called primary variables) from other, easily measured variables (called auxiliary variables) through a mathematical relationship, so that software takes over the function of hardware (sensors). The method responds quickly, provides information on the primary variable continuously, and has advantages such as low investment and simple maintenance. Applying soft measurement to the sewage treatment process can reduce the energy consumption and equipment maintenance costs of the plant and helps to compensate for the errors produced by sensor measurement. However, a traditional offline soft-measurement model is trained once on a large amount of data and cannot be changed afterwards, so it may fit new data poorly. An online soft-measurement model first establishes an offline model and then changes the model as operating conditions change, keeping it nearly synchronized with the latest data, so prediction accuracy is correspondingly improved.

Accompanying drawing explanation

Fig. 1 shows the fit of the online model's predicted effluent BOD to the measured values;

Fig. 2 shows the distribution of the weights obtained by offline training of the model;

Fig. 3 shows the fit of the offline model's predicted effluent BOD to the measured values.

Embodiment

The present invention is described in further detail below with reference to an embodiment and the accompanying drawings, but embodiments of the invention are not limited thereto.

The invention uses a fast relevance vector machine to establish the offline model; improving the training process with the fast marginal likelihood algorithm speeds up online training so that the hyperparameters reach stable values more quickly. The algorithm steps are as follows:

S1. Initialize $\sigma^2$;

S2. Initialize $\alpha_i$ with a single basis vector $\phi_i$; from formula (24), $\alpha_i = \dfrac{\|\phi_i\|^2}{\|\phi_i^T t\|^2 / \|\phi_i\|^2 - \sigma^2}$, and set all other $\alpha_m$ ($m \neq i$) to infinity;

S3. Compute $\Sigma$ and $\mu$, and initialize $S_m$ and $Q_m$ for all $M$ basis functions $\phi_m$;

S4. Select a candidate basis vector $\phi_i$ from the set of all $M$ basis functions $\phi_m$;

S5. Compute $\theta_i \equiv Q_i^2 - S_i$;

S6. If $\theta_i > 0$ and $\alpha_i < \infty$ (the basis vector $\phi_i$ is in the model), re-estimate $\alpha_i$;

S7. If $\theta_i > 0$ and $\alpha_i = \infty$ (the basis vector $\phi_i$ is not in the model), add $\phi_i$ to the model and re-estimate $\alpha_i$;

S8. If $\theta_i \leq 0$ and $\alpha_i < \infty$, delete $\phi_i$ from the model and set $\alpha_i = \infty$;

S9. Estimate the noise variance $\sigma^2 = \dfrac{\|t - \Phi\mu\|^2}{N - M + \sum_m \alpha_m \Sigma_{mm}}$, where $N$ is the number of data points and $M$ is the number of basis functions;

S10. Recompute the covariance matrix $\Sigma$, the weight vector $\mu$ and the corresponding $S_m$ and $Q_m$ of the iteration;

S11. If the algorithm has converged or the maximum number of iterations has been reached, save the weights and the bias and end training; otherwise go to S4 and continue training.
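The initial value of $\alpha_i$ in step S2 can be traced back to formula (24). As a worked sketch, assume the model contains only the single basis vector $\phi_i$, so that $C_{-i} = \sigma^2 I$; the result is valid when $Q_i^2 > S_i$, i.e. when the denominator is positive.

```latex
\begin{aligned}
S_i &= \phi_i^T (\sigma^2 I)^{-1} \phi_i = \frac{\|\phi_i\|^2}{\sigma^2},
\qquad
Q_i = \phi_i^T (\sigma^2 I)^{-1} t = \frac{\phi_i^T t}{\sigma^2}, \\
\alpha_i &= \frac{S_i^2}{Q_i^2 - S_i}
  = \frac{\|\phi_i\|^4 / \sigma^4}
         {(\phi_i^T t)^2 / \sigma^4 - \|\phi_i\|^2 / \sigma^2}
  = \frac{\|\phi_i\|^2}{(\phi_i^T t)^2 / \|\phi_i\|^2 - \sigma^2}.
\end{aligned}
```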

The sewage data come from the University of California, Irvine (UCI) Machine Learning Repository. BOD is a comprehensive indicator of the degree to which a water body is polluted by organic matter, and it is closely related to variables such as the suspended solids concentration, the COD of the influent and effluent, the influent BOD, the flow rate, the temperature and the pH value. The auxiliary variables needed for modeling are: degradable solids concentration RD-SED-G, suspended solids concentration RD-SS-G, biochemical oxygen demand RD-DBO-G and chemical oxygen demand RD-DQO-G; in the primary sedimentation tank, biochemical oxygen demand RD-DBO-P and suspended solids concentration RD-SS-P; in the secondary sedimentation tank, biochemical oxygen demand RD-DBO-S and chemical oxygen demand RD-DQO-S; in the influent, biochemical oxygen demand DBO and chemical oxygen demand DQO; in the secondary treatment, chemical oxygen demand DQO, biochemical oxygen demand DBO, suspended solids concentration SS, pH value PH-S and degradable solids concentration SED; and in the effluent, chemical oxygen demand DQO-S, degradable solids concentration SED-S, suspended solids concentration SS-S and pH value PH-S. There are thus 19 input attributes and 1 output attribute. After preprocessing, 400 groups of data are selected, of which 200 are used to train the model and 200 are used as new data to test the accuracy of the model.
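The preprocessing described in step S1 (rejecting abnormal points and normalizing to [0, 1]) can be sketched as below. The text does not say how abnormal points are identified, so the three-standard-deviation rule used here is purely an assumption, as are the function names.

```python
import numpy as np

def reject_outliers(X, t, k=3.0):
    """Drop samples where any input or the output lies more than k standard
    deviations from its column mean (assumed rule; not given in the source)."""
    data = np.column_stack([X, t])
    std = data.std(axis=0)
    std[std == 0] = 1.0  # constant columns never flag an outlier
    z = np.abs(data - data.mean(axis=0)) / std
    keep = (z <= k).all(axis=1)
    return X[keep], t[keep]

def minmax_normalize(X):
    """Scale each column of X to [0, 1]; also return the (min, range) used,
    so that new samples can be scaled the same way."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (X - lo) / span, (lo, span)
```

Keeping the returned `(lo, span)` pair matters for online use: incoming samples should be scaled with the training statistics rather than their own.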

(2) The RVM modeling procedure is as follows:

Given the sewage data set $\{(x_n, t_n),\ n = 1, 2, \ldots, N\}$ with $x_n \in \mathbb{R}^d$ and $t_n \in \mathbb{R}$, where $N$ is the number of samples, assume:

$$ t_n = y(x_n; w) + \varepsilon_n \tag{1} $$

where $y(\cdot)$ is a nonlinear function and $\varepsilon_n$ is Gaussian noise with mean 0 and variance $\sigma^2$, i.e. $\varepsilon_n \sim \mathcal{N}(0, \sigma^2)$, so that $t_n \sim \mathcal{N}(y(x_n), \sigma^2)$. The function $y(x)$ is defined as

$$ y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \tag{2} $$

where the basis functions are given by $\phi_i(x) = K(x, x_i)$, i.e. the kernel is parameterized by the training vectors. Assuming the $t_n$ are mutually independent, the likelihood function of the whole training set can be written as

$$ p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\|t - \Phi w\|^2}{2\sigma^2} \right) \tag{3} $$

where $t = [t_1, t_2, \ldots, t_N]^T$, $w = [w_0, w_1, \ldots, w_N]^T$, and $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^T$ is the $N \times (N+1)$ design matrix formed from the nonlinear basis functions, with $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]^T$.
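The design matrix described above can be built in a few lines. This is a minimal sketch assuming a Gaussian kernel with a hypothetical `width` parameter; the text does not specify which kernel $K$ is used.

```python
import numpy as np

def gaussian_kernel(A, B, width=1.0):
    """K(a, b) = exp(-||a - b||^2 / width^2) for every row pair of A and B.
    The kernel choice is an assumption; the source only writes K(x, x_i)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / width ** 2)

def design_matrix(X, X_train, width=1.0):
    """Rows are phi(x_n) = [1, K(x_n, x_1), ..., K(x_n, x_N)]."""
    K = gaussian_kernel(X, X_train, width)
    return np.hstack([np.ones((X.shape[0], 1)), K])
```

Calling `design_matrix(X_train, X_train)` gives the $N \times (N+1)$ matrix $\Phi$ used for training; calling it with new inputs as the first argument gives the rows needed for prediction.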

Because the model has about as many parameters as training samples, the maximum-likelihood estimates of $w$ and $\sigma^2$ obtained from formula (3) are likely to over-fit. A common way to avoid over-fitting is to impose constraints on the parameters. Here, within the Bayesian probabilistic framework, these parameters are constrained by defining a prior probability distribution:

$$ p(w \mid \alpha) = \prod_{j=0}^{N} \mathcal{N}(w_j \mid 0, \alpha_j^{-1}) \tag{4} $$

In formula (4), the hyperparameter vector is $\alpha = [\alpha_0, \alpha_1, \ldots, \alpha_N]^T$; more importantly, each hyperparameter $\alpha_j$ is associated only with its own weight $w_j$. Under this constraint, after learning from a large amount of sewage data most hyperparameters tend to infinity and the corresponding weights become 0, which makes the RVM highly sparse.

By Bayes' rule, inference about the unknowns given the data proceeds by computing the posterior probability:

$$ p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)} \tag{5} $$

$$ p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2 \tag{6} $$

$$ p(w \mid t, \alpha, \beta) = \frac{p(w \mid \alpha)\, p(t \mid w, \beta)}{p(t \mid \alpha, \beta)} \tag{7} $$

The posterior is decomposed as

$$ p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \tag{8} $$

so the posterior distribution of the weights is

$$ p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (w - \mu)^T \Sigma^{-1} (w - \mu) \right\} \tag{9} $$

with covariance

$$ \Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1} \tag{10} $$

and mean

$$ \mu = \sigma^{-2} \Sigma \Phi^T t \tag{11} $$

where the matrix $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$. The hyperparameters and the noise variance are re-estimated by

$$ \alpha_i^{new} = \frac{\gamma_i}{\mu_i^2} \tag{12} $$

$$ (\sigma^2)^{new} = \frac{\|t - \Phi\mu\|^2}{N - \sum_i \gamma_i} \tag{13} $$

where $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$ and $\Sigma_{ii}$ is the i-th diagonal element of the covariance matrix $\Sigma$. Iterating formulas (10) to (13) finally yields the estimates of the hyperparameters $\alpha$ and the variance $\sigma^2$.

The predicted sewage quality output is $y_* = \mu^T \phi(x_*)$, where $x_*$ is the input of the sewage treatment process. The prediction results of the offline model are shown in Fig. 3, and the distribution of the weights is shown in Fig. 2.
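One pass of the re-estimation in formulas (10) to (13) can be written compactly. The sketch below shows a single update step only; in practice the step is repeated until $\alpha$ and $\sigma^2$ stabilize, and basis functions whose $\alpha_i$ grows very large are pruned, which is omitted here. The function names are illustrative assumptions.

```python
import numpy as np

def posterior(Phi, t, alpha, sigma2):
    """Formulas (10) and (11): covariance and mean of the weight posterior."""
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(alpha))
    mu = Sigma @ Phi.T @ t / sigma2
    return Sigma, mu

def update_step(Phi, t, alpha, sigma2):
    """One re-estimation of alpha and sigma^2 via formulas (12) and (13)."""
    N = Phi.shape[0]
    Sigma, mu = posterior(Phi, t, alpha, sigma2)
    gamma = 1.0 - alpha * np.diag(Sigma)        # gamma_i = 1 - alpha_i * Sigma_ii
    alpha_new = gamma / mu ** 2                 # formula (12)
    sigma2_new = np.sum((t - Phi @ mu) ** 2) / (N - gamma.sum())  # formula (13)
    return alpha_new, sigma2_new, gamma
```

Each $\gamma_i$ lies between 0 and 1 and can be read as how well the data determine the corresponding weight; weights with $\gamma_i$ near 0 are the ones that end up pruned.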

(3) Let the latest sewage input attributes be $x_{new}$ and the actual effluent BOD value be $y_{new}$. The online updating algorithm for effluent BOD then comprises the following steps:

S1. Build the initial model from historical data using the basic regression algorithm of the fast relevance vector machine described in the previous section;

S2. If new data have arrived, predict the new sewage data $x_{new}$ with formula (2) and compute the deviation; otherwise go to S8;

S3. Add the new sample $(x_{new}, y_{new})$ to the model, with its weight $w$ initialized to 0 and $\alpha_{new}$ initialized to 1;

S5. Estimate the noise variance $\sigma^2 = \dfrac{\|t - \Phi\mu\|^2}{N - M + \sum_m \alpha_m \Sigma_{mm}}$, where $N$ is the number of data points and $M$ is the number of basis functions;

S6. Recompute $\Sigma$, $\mu$ and the corresponding $S_m$ and $Q_m$ of the iteration, using $\Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1}$ and $\mu = \sigma^{-2} \Sigma \Phi^T t$;

S7. If the algorithm has converged or the maximum number of iterations has been reached, save the weights and the bias and go to S2 to continue predicting; otherwise go to S4;

S8. End of program.
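The predict-then-update cycle of the online algorithm can be sketched as below. This is a simplified illustration under several assumptions, not the full procedure: step S4 is not recoverable from the text and the hyperparameter re-estimation loop is omitted, so the sketch only recomputes the posterior mean (step S6) with $\alpha$ and $\sigma^2$ held fixed after each new sample. The class name `OnlineRVM`, the Gaussian kernel and its `width` are hypothetical.

```python
import numpy as np

def kernel(A, B, width=1.0):
    """Assumed Gaussian kernel between every row of A and every row of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / width ** 2)

class OnlineRVM:
    """Keeps the samples, alpha and sigma2; refits the posterior as data arrive."""

    def __init__(self, X, t, alpha, sigma2, width=1.0):
        # alpha has one entry for the bias plus one per stored sample.
        self.X, self.t = X.copy(), t.copy()
        self.alpha, self.sigma2, self.width = alpha.copy(), sigma2, width
        self._refit()

    def _phi(self, X):
        return np.hstack([np.ones((len(X), 1)), kernel(X, self.X, self.width)])

    def _refit(self):
        # Posterior mean over the basis functions that are still in the model.
        Phi = self._phi(self.X)
        act = np.isfinite(self.alpha)
        Pa = Phi[:, act]
        Sigma = np.linalg.inv(Pa.T @ Pa / self.sigma2 + np.diag(self.alpha[act]))
        self.mu = np.zeros(len(self.alpha))
        self.mu[act] = Sigma @ Pa.T @ self.t / self.sigma2

    def predict(self, x):
        return float((self._phi(x[None, :]) @ self.mu)[0])

    def update(self, x_new, y_new):
        deviation = y_new - self.predict(x_new)      # step S2: predict, compare
        self.X = np.vstack([self.X, x_new])          # step S3: add the sample
        self.t = np.append(self.t, y_new)
        self.alpha = np.append(self.alpha, 1.0)      # alpha_new initialized to 1
        self._refit()                                # step S6 with fixed alpha, sigma2
        return deviation
```

In a running plant, `update` would be called once for each new laboratory BOD value, and `predict` for every incoming set of influent measurements in between.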

The prediction results of the online model in actual operation are shown in Fig. 1.

The above embodiment is a preferred embodiment of the present invention, but embodiments of the invention are not limited to it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the invention shall be regarded as an equivalent replacement and is included within the scope of protection of the invention.

Claims

1. An online soft-measurement method for wastewater treatment based on a fast relevance vector machine, characterized by comprising the following steps in order:

2. The online soft-measurement method for wastewater treatment based on a fast relevance vector machine according to claim 1, characterized by specifically comprising the following steps:

$$ t_n = y(x_n; w) + \varepsilon_n \tag{1} $$

$$ y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \tag{2} $$

$$ p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\|t - \Phi w\|^2}{2\sigma^2} \right) \tag{3} $$

$$ p(w \mid \alpha) = \prod_{j=0}^{N} \mathcal{N}(w_j \mid 0, \alpha_j^{-1}) \tag{4} $$

$$ p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)} \tag{5} $$

$$ p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2 \tag{6} $$

$$ p(w \mid t, \alpha, \beta) = \frac{p(w \mid \alpha)\, p(t \mid w, \beta)}{p(t \mid \alpha, \beta)} \tag{7} $$

The posterior is decomposed as

$$ p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \tag{8} $$

so the posterior distribution of the weights is

$$ p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (w - \mu)^T \Sigma^{-1} (w - \mu) \right\} \tag{9} $$

with covariance

$$ \Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1} \tag{10} $$

and mean

$$ \mu = \sigma^{-2} \Sigma \Phi^T t \tag{11} $$

where the matrix $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$, and

$$ \alpha_i^{new} = \frac{\gamma_i}{\mu_i^2} \tag{12} $$

$$ (\sigma^2)^{new} = \frac{\|t - \Phi\mu\|^2}{N - \sum_i \gamma_i} \tag{13} $$

where $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$ and $\Sigma_{ii}$ is the i-th diagonal element of the covariance matrix $\Sigma$. Iterating formulas (10) to (13) finally yields the estimates of the hyperparameters $\alpha$ and the variance $\sigma^2$. The predicted sewage quality output is $y_* = \mu^T \phi(x_*)$, where $x_*$ is the input of the sewage treatment process;

S3. Estimate the hyperparameters with the fast marginal likelihood algorithm.

3. The online soft-measurement method for wastewater treatment based on a fast relevance vector machine according to claim 2, characterized in that step S3 specifically comprises the following steps in order:

$$ \begin{aligned} L(\alpha) &= \log p(t \mid \alpha, \sigma^2) = \log \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw \\ &= -\frac{1}{2}\left[ N \log 2\pi + \log |C| + t^T C^{-1} t \right] \end{aligned} \tag{14} $$

where $C = \sigma^2 I + \Phi A^{-1} \Phi^T$ and $t = [t_1, t_2, \ldots, t_N]^T$;

$$ \begin{aligned} C &= \sigma^2 I + \Phi A^{-1} \Phi^T = \sigma^2 I + \sum_{m \neq i} \alpha_m^{-1} \phi_m \phi_m^T + \alpha_i^{-1} \phi_i \phi_i^T \\ &= C_{-i} + \alpha_i^{-1} \phi_i \phi_i^T \end{aligned} \tag{15} $$

$$ \begin{aligned} |C| &= |C_{-i}|\, \left| 1 + \alpha_i^{-1} \phi_i^T C_{-i}^{-1} \phi_i \right| \\ C^{-1} &= C_{-i}^{-1} - \frac{C_{-i}^{-1} \phi_i \phi_i^T C_{-i}^{-1}}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \end{aligned} \tag{16} $$

Formula (14) can therefore be rewritten as

$$ \begin{aligned} L(\alpha) &= -\frac{1}{2}\left[ N \log 2\pi + \log |C_{-i}| + t^T C_{-i}^{-1} t - \log \alpha_i + \log(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i) - \frac{(\phi_i^T C_{-i}^{-1} t)^2}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \right] \\ &= L(\alpha_{-i}) + \frac{1}{2}\left[ \log \alpha_i - \log(\alpha_i + S_i) + \frac{Q_i^2}{\alpha_i + S_i} \right] \\ &= L(\alpha_{-i}) + l(\alpha_i) \end{aligned} \tag{17} $$

$$ \frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\partial l(\alpha_i)}{\partial \alpha_i} = \frac{1}{2}\left[ \frac{1}{\alpha_i} - \frac{1}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} - \frac{(\phi_i^T C_{-i}^{-1} t)^2}{(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i)^2} \right] \tag{18} $$

Let

$$ S_i = \phi_i^T C_{-i}^{-1} \phi_i, \qquad Q_i = \phi_i^T C_{-i}^{-1} t \tag{19} $$

so that formula (18) can be rewritten as

$$ \frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{2(\alpha_i + S_i)^2} \tag{20} $$

$$ \alpha_i = \frac{S_i^2}{Q_i^2 - S_i} \tag{21} $$

Taking the second partial derivative of $L(\alpha)$ with respect to $\alpha_i$ gives

$$ \begin{aligned} \frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} &= \frac{-\alpha_i^{-2} S_i^2 (\alpha_i + S_i)^2 - 2(\alpha_i + S_i)\left[\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)\right]}{2(\alpha_i + S_i)^4} \\ &= -\frac{\alpha_i^{-2} S_i^2}{2(\alpha_i + S_i)^2} - \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{(\alpha_i + S_i)^3} \end{aligned} \tag{22} $$

Substituting the stationary point (21), at which the numerator of formula (20) vanishes, into formula (22) gives

$$ \left. \frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} \right|_{\alpha_i = \frac{S_i^2}{Q_i^2 - S_i}} = \frac{-S_i^2}{2\alpha_i^2 (\alpha_i + S_i)^2} \tag{23} $$

$$ \alpha_i = \begin{cases} \dfrac{S_i^2}{Q_i^2 - S_i}, & Q_i^2 > S_i \\ \infty, & Q_i^2 \leq S_i \end{cases} \tag{24} $$

This gives the maximum of $L(\alpha)$.

4. The online soft-measurement method for wastewater treatment based on a fast relevance vector machine according to claim 3, characterized in that the marginal likelihood $L(\alpha)$ is maximized as follows:

5. The online soft-measurement method for wastewater treatment based on a fast relevance vector machine according to claim 3, wherein the basic regression algorithm of the fast relevance vector machine comprises the following steps:

I. Initialize $\sigma^2$;

III. Compute $\Sigma$ and $\mu$, and initialize $S_m$ and $Q_m$ for all $M$ basis functions $\phi_m$;

V. Compute $\theta_i = Q_i^2 - S_i$;

VIII. If $\theta_i \leq 0$ and $\alpha_i < \infty$, delete $\phi_i$ from the model and set $\alpha_i = \infty$;

X. Recompute the covariance matrix $\Sigma$, the weight vector $\mu$ and the corresponding $S_m$ and $Q_m$ of the iteration;