CN104680015A - Online soft measurement method for sewage treatment based on quick relevance vector machine - Google Patents

Online soft measurement method for sewage treatment based on quick relevance vector machine

Info

Publication number
CN104680015A
Authority
CN
China
Prior art keywords
alpha
phi
sigma
model
formula
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510093369.6A
Other languages
Chinese (zh)
Inventor
许玉格
曹涛
罗飞
宋亚龄
张雍涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Application filed by South China University of Technology SCUT
Priority to CN201510093369.6A
Publication of CN104680015A

Landscapes

  • Activated Sludge Processes (AREA)

Abstract

The invention discloses an online soft measurement method for sewage treatment based on a fast relevance vector machine. The method comprises the following steps in order: estimating the hyperparameters with a fast marginal likelihood algorithm to obtain the weights and the sample bias of the model; then establishing an online prediction model based on the fast relevance vector machine and optimizing the model parameters, so that the BOD (biochemical oxygen demand) of the sewage is measured accurately and quickly. The method meets real-time requirements, and the optimized prediction model improves prediction accuracy. The online soft measurement model of sewage quality built with the fast relevance vector machine has high prediction accuracy, strong generalization ability and a short update time, which is significant for reducing the operating costs of a sewage treatment plant, reflecting the sewage quality in real time and automating the control of sewage treatment.

Description

An online soft measurement method for sewage treatment based on a fast relevance vector machine
Technical field
The present invention relates to the field of sewage treatment, and in particular to an online soft measurement method for sewage treatment based on a fast relevance vector machine.
Background art
Sewage treatment is an indispensable part of economic development and water resource protection. With the rapid growth of the national economy, sewage discharge has increased greatly, while sewage treatment plants are too few and treatment cycles too long, falling far short of environmental protection requirements. At the same time, investment in environmental protection is growing and sewage treatment technology is receiving more and more attention. National development plans explicitly call for researching and promoting low-energy, efficient sewage treatment technology.
In sewage discharge standards, the parameters used to judge compliance include chemical oxygen demand (COD), biochemical oxygen demand (BOD), ammonia nitrogen, phosphorus and suspended solids. BOD and COD reflect the degree of organic pollution of the water, and the BOD/COD ratio reflects the biodegradability of the sewage. Measuring these two parameters is therefore very valuable for controlling sewage treatment. COD is the amount of oxidant consumed in oxidizing the reducing substances in 1 liter of water sample under specified conditions, converted to the milligrams of oxygen required per liter of fully oxidized sample and expressed in mg/L. BOD is the amount of dissolved oxygen consumed by microorganisms in decomposing and oxidizing organic matter at a specified temperature and over a specified time, expressed in mg/L.
At present, sewage treatment plants generally measure the BOD and COD concentrations with the dilution method or with sensors. However, the analysis cycle for these two indicators is long and the measurements often contain errors, so they cannot reflect on-site conditions in time; the effluent control system therefore suffers from a large time delay and cannot perform at its best. The present invention proposes a new soft measurement method for BOD. The training process of the relevance vector machine is improved with a fast marginal likelihood algorithm, so that the hyperparameters reach stable values more quickly and the weights and bias are obtained; an online fast relevance vector machine soft measurement model is then built to measure the effluent BOD of the sewage treatment process.
Summary of the invention
The object of the invention is to overcome the shortcomings and deficiencies of the prior art by providing an online soft measurement method for sewage treatment based on a fast relevance vector machine.
The object of the invention is achieved by the following technical scheme:
An online soft measurement method for sewage treatment based on a fast relevance vector machine, comprising the following steps in order:
A. estimating the hyperparameters with a fast marginal likelihood algorithm to obtain the weights and the sample bias of the model;
B. establishing an online prediction model based on the fast relevance vector machine and optimizing the model parameters, thereby measuring the BOD in the sewage accurately and quickly.
The online soft measurement method for sewage treatment based on a fast relevance vector machine specifically comprises the following steps:
S1. Remove the outliers from the input and output data. Because the input variables have different dimensions (units), normalize them to the interval [0,1];
S2. Given a sewage data set $\{(x_n, t_n)\}_{n=1}^{N}$ with $x_n \in R^d$, $t_n \in R$ and $N$ the number of samples, and considering for simplicity only a scalar target function, we follow the standard probabilistic formulation and assume:
$$t_n = y(x_n; w) + \varepsilon_n \qquad (1)$$
where $y(\cdot)$ is a nonlinear function and $\varepsilon_n$ is Gaussian noise with mean 0 and variance $\sigma^2$, i.e. $\varepsilon_n \sim N(0, \sigma^2)$. Hence $t_n \sim N(y(x_n), \sigma^2)$, that is, $t_n$ follows a Gaussian distribution with mean $y(x_n)$ and variance $\sigma^2$. As in the support vector machine, the function $y(x)$ is defined as
$$y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \qquad (2)$$
where the basis functions are given by $\phi_i(x) = K(x, x_i)$, i.e. the kernel is parametrized by the training vectors. Assuming the $t_n$ are mutually independent, the likelihood function of the whole training set can be written as
$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\|t - \Phi w\|^2}{2\sigma^2} \right) \qquad (3)$$
where $t = [t_1, t_2, \dots, t_N]^T$, $w = [w_0, w_1, \dots, w_N]^T$, and $\Phi$ is the $N \times (N+1)$ design matrix whose columns $[\phi_1, \phi_2, \dots, \phi_M]$ are the nonlinear basis vectors and whose $n$-th row is $\phi(x_n)^T$, with $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \dots, K(x_n, x_N)]^T$;
Because the model has about as many parameters as training samples, the maximum likelihood estimates of $w$ and $\sigma^2$ obtained from formula (3) are likely to overfit. A common way to avoid overfitting is to impose constraints on the parameters; here, within a Bayesian probabilistic framework, $w$ and $\sigma^2$ are constrained by defining prior probability distributions;
To favor smoother functions, the prior distribution of $w$ is defined as a zero-mean Gaussian:
$$p(w \mid \alpha) = \prod_{j=0}^{N} N(w_j \mid 0, \alpha_j^{-1}) \qquad (4)$$
In formula (4), $\alpha = [\alpha_0, \alpha_1, \dots, \alpha_N]^T$ is the vector of hyperparameters; importantly, each hyperparameter $\alpha_j$ is associated only with its own weight $w_j$. Under this constraint, after learning from a large amount of sewage data most hyperparameters tend to infinity and the corresponding weights go to 0, which makes the RVM highly sparse;
With the prior defined, Bayesian inference proceeds by computing, via Bayes' rule, the posterior probability of the unknowns given the data:
$$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)} \qquad (5)$$
Given a test point $x_*$, the predictive distribution of the corresponding effluent quality value $t_*$ is
$$p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2 \qquad (6)$$
According to Bayes' formula, the sample likelihood function (3) and the prior distribution (4) of $w$ give the posterior distribution of $w$ (writing $\beta = \sigma^{-2}$ for the noise precision):
$$p(w \mid t, \alpha, \beta) = \frac{p(w \mid \alpha)\, p(t \mid w, \beta)}{p(t \mid \alpha, \beta)} \qquad (7)$$
The posterior probability is decomposed as
$$p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \qquad (8)$$
The posterior distribution of the weights is therefore
$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (w - \mu)^T \Sigma^{-1} (w - \mu) \right\} \qquad (9)$$
Its covariance is
$$\Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1} \qquad (10)$$
and its mean is
$$\mu = \sigma^{-2} \Sigma \Phi^T t \qquad (11)$$
where the matrix $A = \mathrm{diag}(\alpha_0, \alpha_1, \dots, \alpha_N)$. The hyperparameters are re-estimated with
$$(\alpha_i)^{new} = \frac{\gamma_i}{\mu_i^2} \qquad (12)$$
$$(\sigma^2)^{new} = \frac{\|t - \Phi\mu\|^2}{N - \sum_i \gamma_i} \qquad (13)$$
where $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$ and $\Sigma_{ii}$ is the $i$-th diagonal element of the covariance matrix $\Sigma$. The estimates of the hyperparameters $\alpha$ and the variance $\sigma^2$ are obtained by iterating formulas (10) to (13). The predicted sewage quality output is $y_* = \mu^T \phi(x_*)$, where $x_*$ is the input of the sewage treatment process;
S3. Estimate the hyperparameters with the fast marginal likelihood algorithm
To address the high time complexity and large memory cost of the relevance vector machine, a fast marginal likelihood algorithm is used. Starting from an empty set, it dynamically adds columns to the basis matrix $\Phi$ during training to increase the marginal likelihood function, or removes redundant columns of $\Phi$ to increase the objective function;
S4. Predict on the sewage sample data: feed the influent data into the trained relevance vector machine soft measurement model; the output of the model is the predicted effluent BOD.
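The re-estimation loop of step S2 (formulas (10) to (13)) and the prediction of step S4 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the Gaussian kernel, its `width`, the initial noise variance, the pruning threshold `alpha_max` and the synthetic sine data are all hypothetical choices that the source does not specify.

```python
import numpy as np

def design_matrix(X, centers, width):
    """Rows phi(x)^T = [1, K(x, c_1), ..., K(x, c_N)]; Gaussian kernel is an assumption."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((X.shape[0], 1)), np.exp(-d2 / width ** 2)])

def fit_rvm(X, t, width=0.2, n_iter=300, alpha_max=1e9):
    """Iterate formulas (10)-(13); weights whose alpha reaches alpha_max are pruned."""
    N = X.shape[0]
    Phi = design_matrix(X, X, width)
    alpha = np.ones(N + 1)                # one hyperparameter per weight, formula (4)
    sigma2 = 0.1 * np.var(t) + 1e-6       # initial noise variance (assumed heuristic)
    for _ in range(n_iter):
        keep = np.flatnonzero(alpha < alpha_max)
        P = Phi[:, keep]
        Sigma = np.linalg.inv(P.T @ P / sigma2 + np.diag(alpha[keep]))   # (10)
        mu = Sigma @ P.T @ t / sigma2                                     # (11)
        gamma = 1.0 - alpha[keep] * np.diag(Sigma)
        with np.errstate(divide="ignore", invalid="ignore"):
            new_alpha = gamma / mu ** 2                                   # (12)
        new_alpha[~(gamma > 0)] = np.inf  # numerically dead weights are pruned
        alpha[keep] = new_alpha
        resid = t - P @ mu
        sigma2 = max(resid @ resid / max(N - gamma.sum(), 1e-6), 1e-8)    # (13)
    return keep, mu, sigma2

def predict(X_new, X_train, keep, mu, width=0.2):
    """y* = mu^T phi(x*), using only the basis functions that survived pruning."""
    return design_matrix(X_new, X_train, width)[:, keep] @ mu

# Synthetic stand-in for normalized sewage data (NOT the patent's data set).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 1))
t = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=60)
keep, mu, sigma2 = fit_rvm(X, t)
rmse = np.sqrt(np.mean((predict(X, X, keep, mu) - t) ** 2))
print(len(keep), "basis functions kept, training RMSE", rmse)
```

The number of surviving basis functions illustrates the sparsity that the prior (4) is meant to produce; the exact count depends on the assumed kernel width and threshold.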
Step S3 specifically comprises the following steps in order:
The relevance vector machine determines the hyperparameters $\alpha$ and the variance $\sigma^2$ by maximizing the marginal likelihood function $p(t \mid \alpha, \sigma^2)$, which is equivalent to maximizing its logarithm. Writing $L(\alpha) = \log p(t \mid \alpha, \sigma^2)$, we obtain
$$L(\alpha) = \log p(t \mid \alpha, \sigma^2) = \log \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw = -\frac{1}{2}\left[ N \log 2\pi + \log |C| + t^T C^{-1} t \right] \qquad (14)$$
where $C = \sigma^2 I + \Phi A^{-1} \Phi^T$ and $t = [t_1, t_2, \dots, t_N]^T$;
To make $L(\alpha)$ easier to maximize, the matrix $C$ is decomposed as follows:
$$C = \sigma^2 I + \Phi A^{-1} \Phi^T = \sigma^2 I + \sum_{m \neq i} \alpha_m^{-1} \phi_m \phi_m^T + \alpha_i^{-1} \phi_i \phi_i^T = C_{-i} + \alpha_i^{-1} \phi_i \phi_i^T \qquad (15)$$
where $C_{-i} = \sigma^2 I + \sum_{m \neq i} \alpha_m^{-1} \phi_m \phi_m^T$ is the covariance matrix obtained when $\alpha_i = \infty$, i.e. after the corresponding basis vector $\phi_i$ has been removed. Standard determinant and inverse identities then give
$$|C| = |C_{-i}|\, \left| 1 + \alpha_i^{-1} \phi_i^T C_{-i}^{-1} \phi_i \right|, \qquad C^{-1} = C_{-i}^{-1} - \frac{C_{-i}^{-1} \phi_i \phi_i^T C_{-i}^{-1}}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \qquad (16)$$
Formula (14) can therefore be rewritten, with $S_i$ and $Q_i$ as defined in formula (19), as
$$L(\alpha) = -\frac{1}{2}\left[ N \log 2\pi + \log |C_{-i}| + t^T C_{-i}^{-1} t - \log \alpha_i + \log(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i) - \frac{(\phi_i^T C_{-i}^{-1} t)^2}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \right] = L(\alpha_{-i}) + \frac{1}{2}\left[ \log \alpha_i - \log(\alpha_i + S_i) + \frac{Q_i^2}{\alpha_i + S_i} \right] = L(\alpha_{-i}) + l(\alpha_i) \qquad (17)$$
Here $L(\alpha_{-i})$ is the log marginal likelihood obtained when $\alpha_i = \infty$, i.e. with the basis vector $\phi_i$ removed, and $l(\alpha_i)$ is the part of the log marginal likelihood that depends only on $\alpha_i$. Taking the partial derivative with respect to $\alpha_i$ gives
$$\frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\partial l(\alpha_i)}{\partial \alpha_i} = \frac{1}{2}\left[ \frac{1}{\alpha_i} - \frac{1}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} - \frac{(\phi_i^T C_{-i}^{-1} t)^2}{(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i)^2} \right] \qquad (18)$$
Let
$$S_i = \phi_i^T C_{-i}^{-1} \phi_i, \qquad Q_i = \phi_i^T C_{-i}^{-1} t \qquad (19)$$
Formula (18) can then be rewritten as
$$\frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{2 (\alpha_i + S_i)^2} \qquad (20)$$
Setting formula (20) to zero, and noting that $\alpha_i$ is an inverse variance (formula (4)) and must be positive, when $Q_i^2 > S_i$ we have
$$\alpha_i = \frac{S_i^2}{Q_i^2 - S_i} \qquad (21)$$
The second partial derivative of $L(\alpha)$ with respect to $\alpha_i$ is
$$\frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} = \frac{-\alpha_i^{-2} S_i^2 (\alpha_i + S_i)^2 - 2 (\alpha_i + S_i) \left[ \alpha_i^{-1} S_i^2 - (Q_i^2 - S_i) \right]}{2 (\alpha_i + S_i)^4} = -\frac{\alpha_i^{-2} S_i^2}{2 (\alpha_i + S_i)^2} - \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{(\alpha_i + S_i)^3} \qquad (22)$$
Combining formulas (20) and (21) gives
$$\left. \frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} \right|_{\alpha_i = \frac{S_i^2}{Q_i^2 - S_i}} = -\frac{S_i^2}{2 \alpha_i^2 (\alpha_i + S_i)^2} \qquad (23)$$
So when $Q_i^2 > S_i$, the second derivative in formula (23) is always negative, and the analysis above shows that $L(\alpha)$ has a unique maximum with respect to $\alpha_i$ at
$$\alpha_i = \begin{cases} \dfrac{S_i^2}{Q_i^2 - S_i}, & Q_i^2 > S_i \\ \infty, & Q_i^2 \le S_i \end{cases} \qquad (24)$$
This yields the maximum of $L(\alpha)$.
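As a quick numerical sanity check of formula (24), the sketch below evaluates $l(\alpha_i)$ from formula (17) on a grid and compares the grid maximizer with the closed form. The values of `S` and `Q` are arbitrary illustrative numbers, not quantities from the patent.

```python
import numpy as np

def l_alpha(alpha, S, Q):
    """l(alpha_i) from formula (17): the part of L(alpha) that depends on alpha_i."""
    return 0.5 * (np.log(alpha) - np.log(alpha + S) + Q ** 2 / (alpha + S))

def optimal_alpha(S, Q):
    """Closed-form maximizer of l(alpha_i), formula (24)."""
    theta = Q ** 2 - S
    return S ** 2 / theta if theta > 0 else np.inf

grid = np.linspace(0.01, 10.0, 100000)

# Case Q^2 > S: a finite maximizer S^2 / (Q^2 - S) = 4 / 7.
S, Q = 2.0, 3.0
a_closed = optimal_alpha(S, Q)
a_grid = grid[np.argmax(l_alpha(grid, S, Q))]

# Case Q^2 <= S: l(alpha_i) keeps increasing, so the maximizer is alpha_i = infinity.
a_closed_inf = optimal_alpha(2.0, 1.0)
idx_inf = int(np.argmax(l_alpha(grid, 2.0, 1.0)))

print(a_closed, a_grid, a_closed_inf, idx_inf == len(grid) - 1)
```

In the second case the grid maximizer sits at the right end of the grid, which is the finite-grid counterpart of $\alpha_i = \infty$.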
$L(\alpha)$ is maximized as follows:
A. If the basis vector $\phi_i$ is in the model, i.e. $\alpha_i < \infty$, but $Q_i^2 \le S_i$, delete $\phi_i$ from the model, i.e. set $\alpha_i = \infty$; this increases $L(\alpha)$;
B. If the basis vector $\phi_i$ is not in the model, i.e. $\alpha_i = \infty$, but $Q_i^2 > S_i$, add $\phi_i$ to the model and set $\alpha_i$ with formula (24); this increases $L(\alpha)$;
C. If the basis vector $\phi_i$ is in the model, i.e. $\alpha_i < \infty$, and $Q_i^2 > S_i$, update $\alpha_i$ with formula (24); this increases $L(\alpha)$.
The basic regression algorithm of the fast relevance vector machine proceeds as follows:
I. Initialize $\sigma^2$;
II. Initialize $\alpha_i$ with a single basis vector $\phi_i$, with its value obtained from formula (24), and set all other $\alpha_m$ ($m \neq i$) to infinity;
III. Compute $\Sigma$ and $\mu$, and initialize $S_m$ and $Q_m$ for all $M$ basis functions $\phi_m$;
IV. Select a candidate basis vector $\phi_i$ from the set of all $M$ basis functions $\phi_m$;
V. Compute $\theta_i \equiv Q_i^2 - S_i$;
VI. If $\theta_i > 0$ and $\alpha_i < \infty$ (the basis vector $\phi_i$ is in the model), re-estimate $\alpha_i$;
VII. If $\theta_i > 0$ and $\alpha_i = \infty$ (the basis vector $\phi_i$ is not in the model), add $\phi_i$ to the model and re-estimate $\alpha_i$;
VIII. If $\theta_i \le 0$ and $\alpha_i < \infty$, delete $\phi_i$ and set $\alpha_i = \infty$;
IX. Estimate the noise variance $\sigma^2$ (cf. formula (13)), where $N$ is the number of data points and $M$ is the number of basis functions;
X. Recompute the covariance matrix $\Sigma$, the weight vector $\mu$ and the corresponding $S_m$ and $Q_m$ for the next iteration;
XI. If the algorithm has converged or the maximum number of iterations has been reached, save the weights and the bias and end this round of training; otherwise go to step IV and continue training.
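Steps I to XI can be sketched as below. This is a deliberately naive illustration, not the patent's implementation: it evaluates $S_i$ and $Q_i$ directly from $C_{-i}$ (formula (19)) for one candidate at a time rather than maintaining them incrementally. The cyclic candidate order, the choice of the first basis vector, the starting value `1.0` used when the initial estimate would be non-positive, the `alpha_max` cutoff, the guard that keeps at least one basis vector, and the synthetic data are all assumptions. The noise update uses the denominator $N - M + \sum_m \alpha_m \Sigma_{mm}$, which equals $N - \sum_i \gamma_i$ of formula (13) when the sums run over the $M$ basis vectors currently in the model.

```python
import numpy as np

def posterior(P, alpha_a, t, sigma2):
    """Sigma and mu of formulas (10) and (11) for the basis vectors in the model."""
    Sigma = np.linalg.inv(P.T @ P / sigma2 + np.diag(alpha_a))
    return Sigma, Sigma @ P.T @ t / sigma2

def fast_rvm(Phi, t, sigma2, max_iter=1000, tol=1e-4, alpha_max=1e12):
    N, M = Phi.shape
    alpha = np.full(M, np.inf)                 # alpha_m = inf: phi_m is not in the model
    # Step II: start from one basis vector (assumed: largest normalized projection on t).
    norm2 = (Phi ** 2).sum(axis=0)
    proj = (Phi.T @ t) ** 2 / norm2
    i0 = int(np.argmax(proj))
    alpha[i0] = norm2[i0] / (proj[i0] - sigma2) if proj[i0] > sigma2 else 1.0
    unchanged = 0
    for it in range(max_iter):
        i = it % M                             # Step IV (assumed: cyclic order)
        others = np.isfinite(alpha)
        others[i] = False
        P = Phi[:, others]
        C_minus_i = sigma2 * np.eye(N) + (P / alpha[others]) @ P.T
        v = np.linalg.solve(C_minus_i, Phi[:, i])
        S, Q = Phi[:, i] @ v, t @ v            # formula (19)
        theta = Q ** 2 - S                     # Step V
        old = alpha[i]
        new = S ** 2 / theta if theta > 0 else np.inf   # Steps VI-VIII via formula (24)
        if new > alpha_max:
            new = np.inf if others.any() else old       # never empty the model (assumed)
        alpha[i] = new
        if np.isfinite(old) and np.isfinite(new):
            changed = abs(np.log(new / old)) > tol
        else:
            changed = np.isfinite(old) != np.isfinite(new)
        unchanged = 0 if changed else unchanged + 1
        active = np.isfinite(alpha)
        Pa = Phi[:, active]
        Sigma, mu = posterior(Pa, alpha[active], t, sigma2)
        resid = t - Pa @ mu
        dof = N - active.sum() + alpha[active] @ np.diag(Sigma)
        sigma2 = max(resid @ resid / max(dof, 1e-6), 1e-8)   # Step IX
        Sigma, mu = posterior(Pa, alpha[active], t, sigma2)  # Step X
        if unchanged >= M:                                   # Step XI
            break
    return alpha, mu, sigma2

# Synthetic stand-in data (NOT the patent's data set), Gaussian kernel assumed.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
t = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=50)
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.2 ** 2)
Phi = np.hstack([np.ones((50, 1)), K])
alpha, mu, sigma2 = fast_rvm(Phi, t, sigma2=0.1 * np.var(t))
active = np.isfinite(alpha)
rmse = np.sqrt(np.mean((Phi[:, active] @ mu - t) ** 2))
print(int(active.sum()), "basis vectors in the model, training RMSE", rmse)
```

Solving with $C_{-i}$ at every step costs far more than the incremental updates a production implementation would use; the sketch trades speed for a one-to-one correspondence with formulas (19) and (24).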
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. The invention uses an online soft measurement model of effluent quality based on a fast relevance vector machine. An offline fast relevance vector machine model is built first and then updated in real time according to operating conditions, retaining the important information in the model. The fast marginal likelihood algorithm speeds up learning so that real-time requirements are met, and the optimized prediction model improves prediction accuracy. The online soft measurement model of sewage quality built with the fast relevance vector machine has high prediction accuracy, strong generalization ability and a short update time, which is significant for reducing the operating costs of a sewage treatment plant, reflecting the sewage quality in real time and automating the control of sewage treatment.
2. The basic idea of soft measurement is to combine automatic control theory with knowledge of the production process and, using computer technology, to infer and estimate important variables that are difficult or temporarily impossible to measure (the primary variables) from other, easily measured variables (the auxiliary variables) through a mathematical relationship, so that software replaces the function of hardware (sensors). This approach responds quickly, provides the primary variable continuously, and has advantages such as low investment and simple maintenance. Applying soft measurement to sewage treatment can reduce the energy consumption and equipment maintenance costs of a sewage treatment plant, and helps to compensate for sensor measurement errors. A traditional offline soft measurement model, however, is trained once on a large amount of data and cannot be changed afterwards, so it may fit new data poorly. An online soft measurement model first builds an offline model and then adapts it as operating conditions change, keeping it nearly synchronized with the latest data, which improves prediction accuracy accordingly.
Brief description of the drawings
Fig. 1 shows the fit of the online model's predictions to the effluent BOD;
Fig. 2 shows the distribution of the weights obtained by offline training of the model;
Fig. 3 shows the fit of the offline model's predictions to the effluent BOD.
Embodiment
The present invention is described in further detail below with reference to an embodiment and the accompanying drawings, but embodiments of the invention are not limited thereto.
The invention uses a fast relevance vector machine to build the offline model. Improving the training process with the fast marginal likelihood algorithm speeds up the online training of the model and lets the hyperparameters reach stable values more quickly. The algorithm steps are as follows:
S1. Initialize $\sigma^2$;
S2. Initialize $\alpha_i$ with a single basis vector $\phi_i$, with its value obtained from formula (24), and set all other $\alpha_m$ ($m \neq i$) to infinity;
S3. Compute $\Sigma$ and $\mu$, and initialize $S_m$ and $Q_m$ for all $M$ basis functions $\phi_m$;
S4. Select a candidate basis vector $\phi_i$ from the set of all $M$ basis functions $\phi_m$;
S5. Compute $\theta_i \equiv Q_i^2 - S_i$;
S6. If $\theta_i > 0$ and $\alpha_i < \infty$ (the basis vector $\phi_i$ is in the model), re-estimate $\alpha_i$;
S7. If $\theta_i > 0$ and $\alpha_i = \infty$ (the basis vector $\phi_i$ is not in the model), add $\phi_i$ to the model and re-estimate $\alpha_i$;
S8. If $\theta_i \le 0$ and $\alpha_i < \infty$, delete $\phi_i$ and set $\alpha_i = \infty$;
S9. Estimate the noise variance $\sigma^2$ (cf. formula (13)), where $N$ is the number of data points and $M$ is the number of basis functions;
S10. Recompute the covariance matrix $\Sigma$, the weight vector $\mu$ and the corresponding $S_m$ and $Q_m$ for the next iteration;
S11. If the algorithm has converged or the maximum number of iterations has been reached, save the weights and the bias and end this round of training; otherwise go to S4 and continue training.
The sewage data come from the UCI (University of California, Irvine) database. BOD is a composite indicator of the degree of organic pollution of a water body. It is closely related to variables such as the suspended solids concentration, the influent and effluent chemical oxygen demand (COD), the influent BOD, the flow rate, the pH value and the temperature. The auxiliary variables used for modeling are: the degradable solids concentration RD-SED-G, suspended solids concentration RD-SS-G, biochemical oxygen demand RD-DBO-G and chemical oxygen demand RD-DQO-G; the biochemical oxygen demand RD-DBO-P and suspended solids concentration RD-SS-P in the primary settling tank; the biochemical oxygen demand RD-DBO-S and chemical oxygen demand RD-DQO-S in the secondary settling tank; the biochemical oxygen demand DBO and chemical oxygen demand DQO of the influent; the chemical oxygen demand DQO, biochemical oxygen demand DBO, suspended solids concentration SS, pH value PH-S and degradable solids concentration SED in secondary treatment; and the chemical oxygen demand DQO-S, degradable solids concentration SED-S, suspended solids concentration SS-S and pH value PH-S of the effluent. There are thus 19 input attributes and 1 output attribute. After preprocessing, 400 groups of data are selected: 200 groups are used to train the model and 200 groups serve as new data to test the accuracy of the model.
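The data preparation described here (normalizing each input variable to [0,1], then a 200/200 split) might look as follows. This is only a sketch: the random array is a placeholder with the stated shape of 400 samples, 19 inputs and 1 output, not the UCI data; outlier removal is omitted because the source does not say which rule is used; and splitting by row order is an assumption.

```python
import numpy as np

def min_max_normalize(data):
    """Scale every column to [0, 1]; constant columns are mapped to 0 (assumed guard)."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (data - lo) / span

rng = np.random.default_rng(0)
samples = rng.normal(size=(400, 20))      # placeholder: 19 input columns + 1 output column

X = min_max_normalize(samples[:, :19])    # auxiliary (input) variables
t = samples[:, 19]                        # effluent BOD (output variable)

X_train, t_train = X[:200], t[:200]       # 200 groups for training
X_test, t_test = X[200:], t[200:]         # 200 groups as new data for testing
print(X_train.shape, X_test.shape, X.min(), X.max())
```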
(2) The RVM modeling procedure is as follows:
Given a sewage data set $\{(x_n, t_n)\}_{n=1}^{N}$ with $x_n \in R^d$, $t_n \in R$ and $N$ the number of samples, assume:
$$t_n = y(x_n; w) + \varepsilon_n \qquad (1)$$
where $y(\cdot)$ is a nonlinear function and $\varepsilon_n$ is Gaussian noise with mean 0 and variance $\sigma^2$, i.e. $\varepsilon_n \sim N(0, \sigma^2)$. Hence $t_n \sim N(y(x_n), \sigma^2)$. The function $y(x)$ is defined as
$$y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \qquad (2)$$
where the basis functions are given by $\phi_i(x) = K(x, x_i)$, i.e. the kernel is parametrized by the training vectors. Assuming the $t_n$ are mutually independent, the likelihood function of the whole training set can be written as
$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\|t - \Phi w\|^2}{2\sigma^2} \right) \qquad (3)$$
where $t = [t_1, t_2, \dots, t_N]^T$, $w = [w_0, w_1, \dots, w_N]^T$, and $\Phi$ is the $N \times (N+1)$ design matrix whose columns $[\phi_1, \phi_2, \dots, \phi_M]$ are the nonlinear basis vectors and whose $n$-th row is $\phi(x_n)^T$, with $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \dots, K(x_n, x_N)]^T$.
Because the model has about as many parameters as training samples, the maximum likelihood estimates of $w$ and $\sigma^2$ obtained from formula (3) are likely to overfit. A common way to avoid overfitting is to impose constraints on the parameters; here, within a Bayesian probabilistic framework, these parameters are constrained by defining prior probability distributions.
To favor smoother functions, the prior distribution of $w$ is defined as a zero-mean Gaussian:
$$p(w \mid \alpha) = \prod_{j=0}^{N} N(w_j \mid 0, \alpha_j^{-1}) \qquad (4)$$
In formula (4), $\alpha = [\alpha_0, \alpha_1, \dots, \alpha_N]^T$ is the vector of hyperparameters; importantly, each hyperparameter $\alpha_j$ is associated only with its own weight $w_j$. Under this constraint, after learning from a large amount of sewage data most hyperparameters tend to infinity and the corresponding weights go to 0, which makes the RVM highly sparse.
Bayesian inference proceeds by computing, via Bayes' rule, the posterior probability of the unknowns given the data:
$$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)} \qquad (5)$$
Given a test point $x_*$, the predictive distribution of the corresponding effluent quality value $t_*$ is
$$p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2 \qquad (6)$$
According to Bayes' formula, the sample likelihood function (3) and the prior distribution (4) of $w$ give the posterior distribution of $w$ (writing $\beta = \sigma^{-2}$ for the noise precision):
$$p(w \mid t, \alpha, \beta) = \frac{p(w \mid \alpha)\, p(t \mid w, \beta)}{p(t \mid \alpha, \beta)} \qquad (7)$$
The posterior probability is decomposed as
$$p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \qquad (8)$$
The posterior distribution of the weights is therefore
$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (w - \mu)^T \Sigma^{-1} (w - \mu) \right\} \qquad (9)$$
Its covariance is
$$\Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1} \qquad (10)$$
and its mean is
$$\mu = \sigma^{-2} \Sigma \Phi^T t \qquad (11)$$
where the matrix $A = \mathrm{diag}(\alpha_0, \alpha_1, \dots, \alpha_N)$. The hyperparameters are re-estimated with
$$(\alpha_i)^{new} = \frac{\gamma_i}{\mu_i^2} \qquad (12)$$
$$(\sigma^2)^{new} = \frac{\|t - \Phi\mu\|^2}{N - \sum_i \gamma_i} \qquad (13)$$
where $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$ and $\Sigma_{ii}$ is the $i$-th diagonal element of the covariance matrix $\Sigma$. The estimates of the hyperparameters $\alpha$ and the variance $\sigma^2$ are obtained by iterating formulas (10) to (13).
The predicted sewage quality output is $y_* = \mu^T \phi(x_*)$, where $x_*$ is the input of the sewage treatment process. The predictions of the offline model are shown in Fig. 3, and the distribution of the weights is shown in Fig. 2.
(3) Let the latest sewage input attributes be $x_{new}$ and the actual effluent BOD value be $y_{new}$. The online update algorithm for the effluent BOD then proceeds as follows:
S1. Build the initial model from historical data using the basic regression algorithm of the fast relevance vector machine described in the previous section;
S2. If new data have arrived, predict for the new sewage data $x_{new}$ with formula (2) and compute the deviation; otherwise go to S8;
S3. Add the new sample $(x_{new}, y_{new})$ to the model, initializing its weight $w$ to 0 and $\alpha_{new}$ to 1;
S5. Estimate the noise variance $\sigma^2$ (cf. formula (13)), where $N$ is the number of data points and $M$ is the number of basis functions;
S6. Recompute $\Sigma$, $\mu$ and the corresponding $S_m$ and $Q_m$ for the next iteration, with $\Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1}$ and $\mu = \sigma^{-2} \Sigma \Phi^T t$;
S7. If the algorithm has converged or the maximum number of iterations has been reached, save the weights and the deviation, then go to S2 and continue predicting; otherwise go to S4.
S8. End of program.
The predictions of the online model in actual operation are shown in Fig. 1.
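The online loop above (predict, compute the deviation, add the sample, retrain) can be sketched as follows. Several pieces are assumptions rather than the patent's method: the source text has no step S4, so this sketch re-estimates $\alpha$ with formula (12) in its place; the class name `OnlineRVM`, the Gaussian kernel and its width, the iteration counts, the pruning threshold and the synthetic data stream are all hypothetical.

```python
import numpy as np

def kernel(X1, X2, width=0.2):
    """Gaussian kernel (assumed choice); the source does not fix K."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / width ** 2)

class OnlineRVM:
    def __init__(self, X, t, sigma2=0.01, alpha_max=1e9):
        self.X, self.t = X.copy(), t.copy()
        self.alpha = np.ones(len(t) + 1)          # bias + one alpha per sample
        self.sigma2, self.alpha_max = sigma2, alpha_max
        self.refit(n_iter=100)                    # S1: initial model from history

    def _phi(self, Xq):
        return np.hstack([np.ones((len(Xq), 1)), kernel(Xq, self.X)])

    def refit(self, n_iter=20):
        Phi, N = self._phi(self.X), len(self.t)
        for _ in range(n_iter):
            keep = np.flatnonzero(self.alpha < self.alpha_max)
            P = Phi[:, keep]
            Sigma = np.linalg.inv(P.T @ P / self.sigma2 + np.diag(self.alpha[keep]))
            mu = Sigma @ P.T @ self.t / self.sigma2               # S6: Sigma and mu
            gamma = 1.0 - self.alpha[keep] * np.diag(Sigma)
            with np.errstate(divide="ignore", invalid="ignore"):
                new_alpha = gamma / mu ** 2                       # stands in for missing S4
            new_alpha[~(gamma > 0)] = np.inf
            self.alpha[keep] = new_alpha
            resid = self.t - P @ mu
            self.sigma2 = max(resid @ resid / max(N - gamma.sum(), 1e-6), 1e-8)  # S5
        self.keep, self.mu = keep, mu

    def predict(self, Xq):
        return self._phi(Xq)[:, self.keep] @ self.mu

    def update(self, x_new, y_new):
        y_hat = self.predict(x_new[None, :])[0]   # S2: predict, then compute deviation
        self.X = np.vstack([self.X, x_new])       # S3: add the new sample
        self.t = np.append(self.t, y_new)
        self.alpha = np.append(self.alpha, 1.0)   # S3: alpha_new initialized to 1
        self.refit()                              # S5-S7: retrain
        return y_hat, y_new - y_hat

# Synthetic stand-in stream (NOT the patent's data).
rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x)
X_hist = rng.uniform(0.0, 1.0, size=(40, 1))
t_hist = f(X_hist[:, 0]) + 0.05 * rng.normal(size=40)
model = OnlineRVM(X_hist, t_hist)
deviations = []
for _ in range(10):
    x_new = rng.uniform(0.0, 1.0, size=1)
    y_new = f(x_new[0]) + 0.05 * rng.normal()
    y_hat, dev = model.update(x_new, y_new)
    deviations.append(dev)
print(np.mean(np.abs(deviations)))
```

Because every new sample adds both a data row and a candidate basis function, the model grows with the stream; a practical deployment would presumably cap or window the stored samples, which the source does not discuss.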
The above embodiment is a preferred embodiment of the present invention, but embodiments of the invention are not limited to it. Any change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the invention is an equivalent replacement and falls within the scope of protection of the invention.

Claims (5)

1. An online soft measurement method for sewage treatment based on a fast relevance vector machine, characterized in that it comprises the following steps in order:
A. estimating the hyperparameters with a fast marginal likelihood algorithm to obtain the weights and the sample bias of the model;
B. establishing an online prediction model based on the fast relevance vector machine and optimizing the model parameters, thereby measuring the BOD in the sewage accurately and quickly.
2. The online soft measurement method for sewage treatment based on a fast relevance vector machine according to claim 1, characterized in that it specifically comprises the following steps:
S1. Remove the outliers from the input and output data. Because the input variables have different dimensions (units), normalize them to the interval [0,1];
S2. Given a sewage data set $\{(x_n, t_n)\}_{n=1}^{N}$ with $x_n \in R^d$, $t_n \in R$ and $N$ the number of samples, and considering for simplicity only a scalar target function, we follow the standard probabilistic formulation and assume:
$$t_n = y(x_n; w) + \varepsilon_n \qquad (1)$$
where $y(\cdot)$ is a nonlinear function and $\varepsilon_n$ is Gaussian noise with mean 0 and variance $\sigma^2$, i.e. $\varepsilon_n \sim N(0, \sigma^2)$. Hence $t_n \sim N(y(x_n), \sigma^2)$, that is, $t_n$ follows a Gaussian distribution with mean $y(x_n)$ and variance $\sigma^2$. As in the support vector machine, the function $y(x)$ is defined as
$$y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \qquad (2)$$
where the basis functions are given by $\phi_i(x) = K(x, x_i)$, i.e. the kernel is parametrized by the training vectors. Assuming the $t_n$ are mutually independent, the likelihood function of the whole training set can be written as
$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left( -\frac{\|t - \Phi w\|^2}{2\sigma^2} \right) \qquad (3)$$
where $t = [t_1, t_2, \dots, t_N]^T$, $w = [w_0, w_1, \dots, w_N]^T$, and $\Phi$ is the $N \times (N+1)$ design matrix whose columns $[\phi_1, \phi_2, \dots, \phi_M]$ are the nonlinear basis vectors and whose $n$-th row is $\phi(x_n)^T$, with $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \dots, K(x_n, x_N)]^T$;
Because the model has about as many parameters as training samples, the maximum likelihood estimates of $w$ and $\sigma^2$ obtained from formula (3) are likely to overfit. A common way to avoid overfitting is to impose constraints on the parameters; here, within a Bayesian probabilistic framework, $w$ and $\sigma^2$ are constrained by defining prior probability distributions;
To favor smoother functions, the prior distribution of $w$ is defined as a zero-mean Gaussian:
$$p(w \mid \alpha) = \prod_{j=0}^{N} N(w_j \mid 0, \alpha_j^{-1}) \qquad (4)$$
In formula (4), $\alpha = [\alpha_0, \alpha_1, \dots, \alpha_N]^T$ is the vector of hyperparameters; importantly, each hyperparameter $\alpha_j$ is associated only with its own weight $w_j$. Under this constraint, after learning from a large amount of sewage data most hyperparameters tend to infinity and the corresponding weights go to 0, which makes the RVM highly sparse;
With the prior defined, Bayesian inference proceeds by computing, via Bayes' rule, the posterior probability of the unknowns given the data:
$$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)} \qquad (5)$$
Given a test point $x_*$, the predictive distribution of the corresponding effluent quality value $t_*$ is
$$p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2 \qquad (6)$$
According to Bayes' formula, the sample likelihood function (3) and the prior distribution (4) of $w$ give the posterior distribution of $w$ (writing $\beta = \sigma^{-2}$ for the noise precision):
$$p(w \mid t, \alpha, \beta) = \frac{p(w \mid \alpha)\, p(t \mid w, \beta)}{p(t \mid \alpha, \beta)} \qquad (7)$$
The posterior probability is decomposed as
$$p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \qquad (8)$$
The posterior distribution of the weights is therefore
$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = (2\pi)^{-(N+1)/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2} (w - \mu)^T \Sigma^{-1} (w - \mu) \right\} \qquad (9)$$
Its covariance is
$$\Sigma = (\sigma^{-2} \Phi^T \Phi + A)^{-1} \qquad (10)$$
and its mean is
$$\mu = \sigma^{-2} \Sigma \Phi^T t \qquad (11)$$
where the matrix $A = \mathrm{diag}(\alpha_0, \alpha_1, \dots, \alpha_N)$. The hyperparameters are re-estimated with
$$(\alpha_i)^{new} = \frac{\gamma_i}{\mu_i^2} \qquad (12)$$
$$(\sigma^2)^{new} = \frac{\|t - \Phi\mu\|^2}{N - \sum_i \gamma_i} \qquad (13)$$
where $\gamma_i \equiv 1 - \alpha_i \Sigma_{ii}$ and $\Sigma_{ii}$ is the $i$-th diagonal element of the covariance matrix $\Sigma$. The estimates of the hyperparameters $\alpha$ and the variance $\sigma^2$ are obtained by iterating formulas (10) to (13). The predicted sewage quality output is $y_* = \mu^T \phi(x_*)$, where $x_*$ is the input of the sewage treatment process;
S3. Estimate the hyperparameters with the fast marginal likelihood algorithm
To address the high time complexity and large memory cost of the relevance vector machine, a fast marginal likelihood algorithm is used. Starting from an empty set, it dynamically adds columns to the basis matrix $\Phi$ during training to increase the marginal likelihood function, or removes redundant columns of $\Phi$ to increase the objective function;
S4. Predict on the sewage sample data: feed the influent data into the trained relevance vector machine soft measurement model; the output of the model is the predicted effluent BOD.
3. The online soft measurement method for sewage treatment based on the fast relevance vector machine according to claim 2, characterized in that step S3 specifically comprises the following steps in order:
The relevance vector machine determines the hyperparameters $\alpha$ and the variance $\sigma^2$ by maximizing the marginal likelihood function $p(t \mid \alpha, \sigma^2)$, which is equivalent to maximizing its logarithm. Writing $L(\alpha) = \log p(t \mid \alpha, \sigma^2)$ and rearranging gives
$$L(\alpha) = \log p(t \mid \alpha, \sigma^2) = \log \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw = -\frac{1}{2} \left[ N \log 2\pi + \log |C| + t^T C^{-1} t \right] \tag{14}$$
where $C = \sigma^2 I + \Phi A^{-1} \Phi^T$ and $t = [t_1, t_2, \ldots, t_N]^T$;
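Formula (14) can be evaluated directly, which is a convenient way to check later rearrangements numerically. The sketch below assumes NumPy; the function name `log_marginal_likelihood` and the convention of representing an excluded basis vector by `alpha_m = inf` (so that `1 / alpha_m` is 0) are choices made for this illustration only.

```python
import numpy as np

def log_marginal_likelihood(Phi, t, alpha, sigma2):
    """Formula (14): L(alpha) = -1/2 [N log(2 pi) + log|C| + t^T C^{-1} t]."""
    N = Phi.shape[0]
    # C = sigma^2 I + Phi A^{-1} Phi^T, with A = diag(alpha)
    C = sigma2 * np.eye(N) + Phi @ np.diag(1.0 / alpha) @ Phi.T
    _, logdet = np.linalg.slogdet(C)          # log|C| without overflow
    quad = t @ np.linalg.solve(C, t)          # t^T C^{-1} t without forming C^{-1}
    return -0.5 * (N * np.log(2.0 * np.pi) + logdet + quad)

# One sample, one basis function: C = 1 + 2*2/4 = 2
L_small = log_marginal_likelihood(np.array([[2.0]]), np.array([1.0]),
                                  np.array([4.0]), 1.0)
```

Forming the N x N matrix C costs O(N^3) per evaluation, so this is useful as a reference check rather than as the training procedure itself.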
To make $L(\alpha)$ easier to maximize, the matrix $C$ is decomposed as follows:
$$C = \sigma^2 I + \Phi A^{-1} \Phi^T = \sigma^2 I + \sum_{m \neq i} \alpha_m^{-1} \phi_m \phi_m^T + \alpha_i^{-1} \phi_i \phi_i^T = C_{-i} + \alpha_i^{-1} \phi_i \phi_i^T \tag{15}$$
where $C_{-i}$ is the covariance matrix obtained when $\alpha_i = \infty$, that is, after the corresponding basis vector $\phi_i$ has been removed. Standard determinant and matrix inversion identities then give
$$|C| = |C_{-i}|\, \left| 1 + \alpha_i^{-1} \phi_i^T C_{-i}^{-1} \phi_i \right|$$
$$C^{-1} = C_{-i}^{-1} - \frac{C_{-i}^{-1} \phi_i \phi_i^T C_{-i}^{-1}}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \tag{16}$$
Formula (14) can therefore be rewritten as
$$L(\alpha) = -\frac{1}{2} \left[ N \log 2\pi + \log |C_{-i}| + t^T C_{-i}^{-1} t - \log \alpha_i + \log\left(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i\right) - \frac{\left(\phi_i^T C_{-i}^{-1} t\right)^2}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} \right] = L(\alpha_{-i}) + \frac{1}{2} \left[ \log \alpha_i - \log(\alpha_i + S_i) + \frac{Q_i^2}{\alpha_i + S_i} \right] = L(\alpha_{-i}) + l(\alpha_i) \tag{17}$$
with $S_i$ and $Q_i$ as defined in formula (19).
Here $L(\alpha_{-i})$ is the logarithm of the marginal likelihood function when $\alpha_i = \infty$, that is, after the corresponding basis vector $\phi_i$ has been removed, and $l(\alpha_i)$ is the part of the log marginal likelihood that depends only on $\alpha_i$. Taking the partial derivative of formula (17) with respect to $\alpha_i$ gives
$$\frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\partial l(\alpha_i)}{\partial \alpha_i} = \frac{1}{2} \left[ \frac{1}{\alpha_i} - \frac{1}{\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i} - \frac{\left(\phi_i^T C_{-i}^{-1} t\right)^2}{\left(\alpha_i + \phi_i^T C_{-i}^{-1} \phi_i\right)^2} \right] \tag{18}$$
Let
$$S_i = \phi_i^T C_{-i}^{-1} \phi_i, \qquad Q_i = \phi_i^T C_{-i}^{-1} t \tag{19}$$
so that formula (18) can be rewritten as
$$\frac{\partial L(\alpha)}{\partial \alpha_i} = \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{2 (\alpha_i + S_i)^2} \tag{20}$$
Setting formula (20) to zero, and noting that $\alpha_i$ is an inverse variance and must therefore be positive, gives for $Q_i^2 > S_i$
$$\alpha_i = \frac{S_i^2}{Q_i^2 - S_i} \tag{21}$$
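The step from formula (18) to formula (20) is only stated in the text. A short worked version of the algebra, using the definitions in formula (19) and the derivative of $l(\alpha_i)$ from formula (17), might read as follows; it is offered as a reading aid and is not part of the original derivation.

```latex
\begin{aligned}
\frac{\partial l(\alpha_i)}{\partial \alpha_i}
&= \frac{1}{2}\left[\frac{1}{\alpha_i} - \frac{1}{\alpha_i + S_i} - \frac{Q_i^2}{(\alpha_i + S_i)^2}\right] \\
&= \frac{(\alpha_i + S_i)^2 - \alpha_i(\alpha_i + S_i) - \alpha_i Q_i^2}{2\,\alpha_i\,(\alpha_i + S_i)^2} \\
&= \frac{S_i^2 + \alpha_i S_i - \alpha_i Q_i^2}{2\,\alpha_i\,(\alpha_i + S_i)^2}
 = \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{2\,(\alpha_i + S_i)^2}.
\end{aligned}
```

The numerator vanishes exactly when $\alpha_i^{-1} S_i^2 = Q_i^2 - S_i$, and this has a positive solution for $\alpha_i$ only if $Q_i^2 > S_i$, which is where the condition attached to formula (21) comes from.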
Taking the second partial derivative of $L(\alpha)$ with respect to $\alpha_i$ gives
$$\frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} = \frac{-\alpha_i^{-2} S_i^2 (\alpha_i + S_i)^2 - 2 (\alpha_i + S_i) \left[ \alpha_i^{-1} S_i^2 - (Q_i^2 - S_i) \right]}{2 (\alpha_i + S_i)^4} = -\frac{\alpha_i^{-2} S_i^2}{2 (\alpha_i + S_i)^2} - \frac{\alpha_i^{-1} S_i^2 - (Q_i^2 - S_i)}{(\alpha_i + S_i)^3} \tag{22}$$
At the stationary point of formula (21) the numerator of formula (20) vanishes, so
$$\left. \frac{\partial^2 L(\alpha)}{\partial \alpha_i^2} \right|_{\alpha_i = \frac{S_i^2}{Q_i^2 - S_i}} = -\frac{S_i^2}{2 \alpha_i^2 (\alpha_i + S_i)^2} \tag{23}$$
Hence, when $Q_i^2 > S_i$, the left-hand side of formula (23) is always negative, and the analysis above shows that $L(\alpha)$ has a unique maximum at
$$\alpha_i = \begin{cases} \dfrac{S_i^2}{Q_i^2 - S_i}, & Q_i^2 > S_i \\ \infty, & Q_i^2 \le S_i \end{cases} \tag{24}$$
This gives the maximum of $L(\alpha)$.
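The closed-form maximizer of formula (24) is simple enough to check numerically. The following Python sketch uses only the standard library; the function names `l_term` and `optimal_alpha` and the example values of S and Q are hypothetical and chosen purely for illustration.

```python
import math

def l_term(alpha, S, Q):
    """alpha_i-dependent part l(alpha_i) of the log marginal likelihood, formula (17)."""
    return 0.5 * (math.log(alpha) - math.log(alpha + S) + Q * Q / (alpha + S))

def optimal_alpha(S, Q):
    """Maximizer of L(alpha) with respect to alpha_i, formula (24)."""
    theta = Q * Q - S
    return S * S / theta if theta > 0 else math.inf

# Arbitrary example values with Q^2 > S
S, Q = 1.0, 2.0
a_star = optimal_alpha(S, Q)              # S^2 / (Q^2 - S) = 1 / 3

# l(alpha) should be lower on either side of the stationary point
is_local_max = (l_term(a_star, S, Q) > l_term(0.9 * a_star, S, Q)
                and l_term(a_star, S, Q) > l_term(1.1 * a_star, S, Q))
```

When $Q_i^2 \le S_i$ the function returns infinity, which corresponds to leaving the basis vector out of the model.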
4. The online soft measurement method for sewage treatment based on the fast relevance vector machine according to claim 3, characterized in that the Bayesian log marginal likelihood $L(\alpha)$ is maximized as follows:
A. If the basis vector $\phi_i$ is in the model, i.e. $\alpha_i < \infty$, but $Q_i^2 \le S_i$, delete $\phi_i$ from the model, i.e. set $\alpha_i = \infty$; this increases $L(\alpha)$;
B. If the basis vector $\phi_i$ is not in the model, i.e. $\alpha_i = \infty$, but $Q_i^2 > S_i$, add $\phi_i$ to the model and update $\alpha_i$ with formula (24); this increases $L(\alpha)$;
C. If the basis vector $\phi_i$ is in the model, i.e. $\alpha_i < \infty$, and $Q_i^2 > S_i$, update $\alpha_i$ with formula (24); this increases $L(\alpha)$.
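The three cases above amount to a small decision rule on the sign of $Q_i^2 - S_i$ and on whether $\alpha_i$ is finite. A minimal Python sketch of that rule follows; the function name `select_action` and the returned labels are hypothetical.

```python
import math

def select_action(alpha_i, S_i, Q_i):
    """Return which of cases A, B, C applies to basis vector phi_i, or None."""
    in_model = math.isfinite(alpha_i)     # alpha_i < infinity means phi_i is in the model
    theta = Q_i ** 2 - S_i
    if in_model and theta <= 0:
        return "A: delete"                # set alpha_i = infinity
    if not in_model and theta > 0:
        return "B: add"                   # add phi_i, set alpha_i by formula (24)
    if in_model and theta > 0:
        return "C: re-estimate"           # update alpha_i by formula (24)
    return None                           # excluded and theta <= 0: nothing to do
```

For example, `select_action(math.inf, 1.0, 2.0)` falls under case B because the vector is outside the model and $Q_i^2 - S_i = 3 > 0$.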
5. The online soft measurement method for sewage treatment based on the fast relevance vector machine according to claim 3, wherein the basic regression algorithm of the fast relevance vector machine comprises the following steps:
I. Initialize $\sigma^2$;
II. Initialize $\alpha_i$ for a single basis vector $\phi_i$, using the value obtained by rearranging formula (24), and set all other $\alpha_m$ ($m \neq i$) to infinity;
III. Compute $\Sigma$ and $\mu$, and initialize $S_m$ and $Q_m$ for all M basis functions $\phi_m$;
IV. Select a candidate basis vector $\phi_i$ from the set of all M basis functions $\phi_m$;
V. Compute $\theta_i = Q_i^2 - S_i$;
VI. If $\theta_i > 0$ and $\alpha_i < \infty$ (the basis vector $\phi_i$ is in the model), re-estimate $\alpha_i$;
VII. If $\theta_i > 0$ and $\alpha_i = \infty$ (the basis vector $\phi_i$ is not in the model), add $\phi_i$ to the model and re-estimate $\alpha_i$;
VIII. If $\theta_i \le 0$ and $\alpha_i < \infty$, delete $\phi_i$ and set $\alpha_i = \infty$;
IX. Estimate the noise variance $\sigma^2$, where N is the number of data points and M is the number of basis functions;
X. Recompute the covariance matrix $\Sigma$, the weights $\mu$, and the corresponding $S_m$ and $Q_m$ for the iteration;
XI. If the algorithm has converged or the maximum number of iterations has been reached, save the weight values and the deviation value and end this training run; otherwise go to step IV and continue training.
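Steps I to XI can be put together in a compact, deliberately unoptimized Python sketch using NumPy. Several details are assumptions made for this illustration because the text does not fix them: the initial noise variance (a tenth of the target variance), the choice of the first basis vector (largest normalized projection onto t), the round-robin choice of candidates in step IV, the use of formula (13) for the noise update in step IX, the guard that keeps at least one basis vector in the model, and all function and variable names. The sketch also recomputes $C_{-i}$ from scratch for each candidate instead of updating $S_m$ and $Q_m$ incrementally, so it shows the logic of the steps rather than their speed.

```python
import numpy as np

def fast_rvm_sketch(Phi, t, max_iter=200, tol=1e-6):
    """Steps I-XI for a design matrix Phi (N x M) and targets t (N,)."""
    N, M = Phi.shape
    sigma2 = 0.1 * np.var(t)                        # I (assumed starting value)
    alpha = np.full(M, np.inf)                      # inf means "not in the model"
    norms = np.sum(Phi ** 2, axis=0)
    i0 = int(np.argmax((Phi.T @ t) ** 2 / norms))   # II (assumed choice of first vector)
    S0, Q0 = norms[i0] / sigma2, (Phi[:, i0] @ t) / sigma2
    alpha[i0] = S0 ** 2 / (Q0 ** 2 - S0) if Q0 ** 2 > S0 else 1.0

    def posterior(alpha, sigma2):
        act = np.flatnonzero(np.isfinite(alpha))    # indices of basis vectors in the model
        Pa = Phi[:, act]
        Sigma = np.linalg.inv(Pa.T @ Pa / sigma2 + np.diag(alpha[act]))  # formula (10)
        mu = Sigma @ Pa.T @ t / sigma2                                   # formula (11)
        return act, Sigma, mu

    act, Sigma, mu = posterior(alpha, sigma2)       # III
    stable = 0
    for it in range(max_iter):
        i = it % M                                  # IV (assumed round-robin order)
        a_minus = alpha.copy()
        a_minus[i] = np.inf
        C_minus = sigma2 * np.eye(N) + (Phi / a_minus) @ Phi.T   # C_{-i}; 1/inf = 0
        phi = Phi[:, i]
        S = phi @ np.linalg.solve(C_minus, phi)     # formula (19)
        Q = phi @ np.linalg.solve(C_minus, t)
        theta = Q ** 2 - S                          # V
        old = alpha[i]
        if theta > 0:
            alpha[i] = S ** 2 / theta               # VI and VII, formula (24)
        elif np.isfinite(alpha[i]) and np.isfinite(alpha).sum() > 1:
            alpha[i] = np.inf                       # VIII (guard keeps one vector)
        act, Sigma, mu = posterior(alpha, sigma2)
        gamma = 1.0 - alpha[act] * np.diag(Sigma)
        sigma2 = np.sum((t - Phi[:, act] @ mu) ** 2) / (N - gamma.sum())  # IX via (13)
        act, Sigma, mu = posterior(alpha, sigma2)   # X
        same = (old == alpha[i]) or (np.isfinite(old) and np.isfinite(alpha[i])
                                     and abs(np.log(alpha[i]) - np.log(old)) < tol)
        stable = stable + 1 if same else 0
        if stable >= M:                             # XI: a full sweep without change
            break
    return act, mu, sigma2

# Synthetic stand-in data: only columns 1 and 5 carry signal
rng = np.random.default_rng(0)
Phi = rng.normal(size=(60, 8))
w_true = np.zeros(8)
w_true[[1, 5]] = [3.0, -2.0]
t = Phi @ w_true + 0.1 * rng.normal(size=60)
active, mu, sigma2 = fast_rvm_sketch(Phi, t)
y_pred = Phi[:, active] @ mu      # y* = mu^T phi(x*) over the retained basis vectors
```

In a real soft-sensing model the columns of Phi would be basis (kernel) functions evaluated on the influent measurements, and the prediction for a new sample would use only the columns listed in `active`.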
CN201510093369.6A 2015-03-02 2015-03-02 Online soft measurement method for sewage treatment based on quick relevance vector machine Pending CN104680015A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510093369.6A CN104680015A (en) 2015-03-02 2015-03-02 Online soft measurement method for sewage treatment based on quick relevance vector machine

Publications (1)

Publication Number Publication Date
CN104680015A true CN104680015A (en) 2015-06-03

Family

ID=53315048

Country Status (1)

Country Link
CN (1) CN104680015A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140099971A1 (en) * 2012-10-10 2014-04-10 University-Industry Cooperation Group Of Kyunghee University Apparatus and method for measuring location of user equipment located indoors in wireless network
CN103793604A (en) * 2014-01-25 2014-05-14 华南理工大学 Sewage treatment soft measuring method based on RVM

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
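Liu Junshun (刘俊顺): "Research on a clustering-based fast classification algorithm for the relevance vector machine" (基于聚类的相关向量机快速分类算法研究), China Master's Theses Full-text Database (《中国优秀硕士学位论文全文数据库》) *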

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104914227A (en) * 2015-06-16 2015-09-16 华南理工大学 Multi-gaussian kernel self-optimization relevance vector machine based wastewater quality soft-measurement method
CN104914227B (en) * 2015-06-16 2016-10-05 华南理工大学 Sewage quality flexible measurement method based on many gaussian kernel self-optimizing Method Using Relevance Vector Machine
CN105023071A (en) * 2015-08-14 2015-11-04 中国科学院重庆绿色智能技术研究院 Water quality prediction method based on Gaussian cloud transformation and fuzzy time sequence
CN105243256A (en) * 2015-08-27 2016-01-13 肖红军 Biochemical oxygen demand parameter online soft measurement method
CN105487526A (en) * 2016-01-04 2016-04-13 华南理工大学 FastRVM (fast relevance vector machine) wastewater treatment fault diagnosis method
CN105487526B (en) * 2016-01-04 2019-04-09 华南理工大学 A kind of Fast RVM sewage treatment method for diagnosing faults
CN105740619B (en) * 2016-01-28 2018-06-12 华南理工大学 Weighting extreme learning machine sewage disposal on-line fault diagnosis method based on kernel function
CN105740619A (en) * 2016-01-28 2016-07-06 华南理工大学 On-line fault diagnosis method of weighted extreme learning machine sewage treatment on the basis of kernel function
CN106021924A (en) * 2016-05-19 2016-10-12 华南理工大学 Sewage online soft-measurement method based on multi-attribute Gaussian kernel function fast relevance vector machine
CN106021924B (en) * 2016-05-19 2019-01-18 华南理工大学 Sewage online soft sensor method based on more attribute gaussian kernel function fast correlation vector machines
CN106681305A (en) * 2017-01-03 2017-05-17 华南理工大学 Online fault diagnosing method for Fast RVM (relevance vector machine) sewage treatment
CN110110890A (en) * 2019-03-28 2019-08-09 杭州电子科技大学 Day wastewater quantity prediction method based on ELMAN neural network
CN110197022A (en) * 2019-05-21 2019-09-03 浙江大学 Parallel probability variation soft-measuring modeling method towards streaming big data
CN110717601A (en) * 2019-10-15 2020-01-21 厦门铅笔头信息科技有限公司 Anti-fraud method based on supervised learning and unsupervised learning
CN110717601B (en) * 2019-10-15 2022-05-03 厦门铅笔头信息科技有限公司 Anti-fraud method based on supervised learning and unsupervised learning
CN111523676A (en) * 2020-04-17 2020-08-11 第四范式(北京)技术有限公司 Method and device for assisting machine learning model to be online
CN111523676B (en) * 2020-04-17 2024-04-12 第四范式(北京)技术有限公司 Method and device for assisting machine learning model to be online

Similar Documents

Publication Publication Date Title
CN104680015A (en) Online soft measurement method for sewage treatment based on quick relevance vector machine
US10570024B2 (en) Method for effluent total nitrogen-based on a recurrent self-organizing RBF neural network
US11346831B2 (en) Intelligent detection method for biochemical oxygen demand based on a self-organizing recurrent RBF neural network
Bagherzadeh et al. Prediction of energy consumption and evaluation of affecting factors in a full-scale WWTP using a machine learning approach
CN107358021B (en) DO prediction model establishment method based on BP neural network optimization
CN108469507B (en) Effluent BOD soft measurement method based on self-organizing RBF neural network
CN104182794B (en) Method for soft measurement of effluent total phosphorus in sewage disposal process based on neural network
CN101799888B (en) Industrial soft measurement method based on bionic intelligent ant colony algorithm
CN102854296A (en) Sewage-disposal soft measurement method on basis of integrated neural network
Hansen et al. Modeling phosphorous dynamics in a wastewater treatment process using Bayesian optimized LSTM
CN109828089A (en) A kind of on-line prediction method of the water quality parameter cultured water based on DBN-BP
CN109492265A (en) The kinematic nonlinearity PLS soft-measuring modeling method returned based on Gaussian process
CN114037163A (en) Sewage treatment effluent quality early warning method based on dynamic weight PSO (particle swarm optimization) optimization BP (Back propagation) neural network
CN103793604A (en) Sewage treatment soft measuring method based on RVM
CN104914227B (en) Sewage quality flexible measurement method based on many gaussian kernel self-optimizing Method Using Relevance Vector Machine
Zhao et al. A soft measurement approach of wastewater treatment process by lion swarm optimizer-based extreme learning machine
CN111125907B (en) Sewage treatment ammonia nitrogen soft measurement method based on hybrid intelligent model
Wang et al. A full-view management method based on artificial neural networks for energy and material-savings in wastewater treatment plants
Qiao et al. A repair algorithm for radial basis function neural network and its application to chemical oxygen demand modeling
CN109408896B (en) Multi-element intelligent real-time monitoring method for anaerobic sewage treatment gas production
CN114330815A (en) Ultra-short-term wind power prediction method and system based on improved GOA (generic object oriented architecture) optimized LSTM (least Square TM)
CN112001436A (en) Water quality classification method based on improved extreme learning machine
Chang et al. Soft measurement of effluent index in sewage treatment process based on overcomplete broad learning system
CN110837886A (en) Effluent NH4-N soft measurement method based on ELM-SL0 neural network
CN113838542B (en) Intelligent prediction method and system for chemical oxygen demand

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150603
