CN103942457B - Water quality parameter time series prediction method based on relevance vector machine regression - Google Patents
- Publication number: CN103942457B
- Application number: CN201410196457.4A
- Authority: CN (China)
- Prior art keywords
- water quality
- quality parameter
- time series
- value
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a water quality parameter time series prediction method based on relevance vector machine (RVM) regression. The method comprises the following steps: (1) acquiring historical water quality parameter data from an automatic water quality monitoring station and pre-processing the data; (2) using the first 2/3 of the pre-processed historical data as the training sample set and the last 1/3 as the test sample set; (3) training an RVM on the training sample set and testing the trained RVM on the test sample set, so as to obtain a water quality parameter time series prediction model based on RVM regression; (4) using that model to predict new water quality parameter values. The method performs time series prediction with a wide prediction range, high accuracy and good stability. It also provides probabilistic output, giving a confidence interval together with each prediction, reduces the prediction time, and allows changes in water quality parameters to be observed in a timely manner.
Description
Technical field
The present invention relates to the field of water quality monitoring, and in particular to a water quality parameter time series prediction method based on relevance vector machine regression.
Background technology
A water quality parameter time series is an ordered sequence of monitoring data that reflects how a given water quality parameter is distributed over time, for example the monitored pH value of a river basin section from week 1 to week 50 of a given year. A water quality parameter time series prediction method uses the acquired set of historical time series to analyse the inherent statistical characteristics and development patterns of the historical data, builds a water quality parameter time series prediction model, and uses the model to obtain predicted data that show the development trend of future data. Water quality parameter time series prediction is basic work for water environment management and pollution control. At present, because early-stage information and technical support are lacking, water pollution accidents in China are mostly accounted for after the fact; changes in water quality cannot be predicted and pollution accidents cannot be averted. Building a reliable water quality parameter time series prediction model has therefore become one of the research hotspots of water environment science in recent years.
The water quality parameter time series prediction models in common use in China and abroad are mainly artificial neural network models and support vector machine (Support Vector Machine, SVM) regression models. Artificial neural network algorithms are prone to over-learning or under-learning and to local minima, their network structure is hard to determine, and they generalise poorly. The SVM regression model is a supervised learning method built on statistical learning theory and the structural risk minimisation principle. It maps the original input through a kernel function into a linearly separable high-dimensional feature space, generalises well, is not prone to over-fitting, and handles small samples, non-linearity, high dimensionality and local minima comparatively well. Compared with the artificial neural network time series prediction model, the SVM-based time series prediction model performs better. In the SVM time series prediction model, however, the kernel function must satisfy the Mercer condition, the number of support vectors grows linearly with the number of training samples, and only a deterministic prediction is given: there is no probabilistic output, so the uncertainty of the prediction cannot be estimated. In practical applications a probabilistic prediction provides important information and helps determine the confidence of a water quality parameter prediction.
The relevance vector machine (Relevance Vector Machine, RVM) is a comparatively new machine learning algorithm proposed by Tipping in 2001 within a Bayesian framework. Its kernel function need not satisfy the Mercer condition, its solution is far sparser than that of the SVM, it provides probabilistic information about the prediction, and it generalises well. RVM has been applied to many practical problems such as pattern recognition and regression estimation, with good results.
The Chinese patent with application number 20131013190.7 provides a sewage quality monitoring method and device. The prediction model adopted there is a soft-sensing method based on the relevance vector machine; compared with models built using neural networks or SVM modelling, it has better applicability and higher prediction accuracy. The method nevertheless has the following defects. First, because the effluent total nitrogen or effluent total phosphorus content is obtained by analysing related parameters, uncertainty in the related data strongly affects the final output; although the output is much improved over the neural network and SVM models, the final result remains unstable. Second, it merely analyses the effluent total nitrogen or total phosphorus content at the current time and cannot perform time series prediction, so its prediction range is small and its accuracy is low.
The content of the invention
The technical problem to be solved by the present invention is to provide a water quality parameter time series prediction method based on relevance vector machine regression that performs time series prediction with a wide prediction range, high accuracy and good stability, and that gives probabilistic output: together with each prediction it provides a confidence interval, reduces the prediction time, and allows changes in water quality parameters to be observed in a timely manner.
To solve the above technical problem, the present invention adopts the following scheme: a water quality parameter time series prediction method based on relevance vector machine regression, comprising the following steps:
Step 1: Collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data so that missing values in the historical data are completed. The missing data are first filled with 0; the historical data are then pre-processed in the time domain according to the time series and frequency-filtered; finally, a best-fit comparison is carried out using the least squares method. On the fitted curve finally obtained, the fitted values at the observation points whose value is 0 are taken as the completion values of the actual missing data, and substituting these completion values completes the missing data in the historical data;
Step 2: Use the first 2/3 of the pre-processed historical water quality parameter data as the training sample set and the last 1/3 as the test sample set;
Step 3: Train the RVM using the water quality parameter values of several preceding consecutive unit times in the training sample set as input and the water quality parameter value of the next unit time as output. Test the trained RVM with the test sample set: feed the water quality parameter values of several preceding consecutive unit times in the test sample set to the input of the trained RVM and observe the predicted value at its output. When the error between the predicted value at the output and the water quality parameter value of the next unit time in the test sample set meets the requirement, the check is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: Use the time series prediction model based on RVM regression to predict new water quality parameter values: feed the water quality parameter values of the most recent several unit times to the input of the prediction model, whose output then predicts the water quality parameter value of the next unit time.
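The split in step 2 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `chronological_split` and the sample values are hypothetical, and the split is assumed to be taken in time order so that the test set follows the training set.

```python
def chronological_split(series, train_fraction=2 / 3):
    """Split a time-ordered sequence: the first part trains, the rest tests."""
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


# Made-up weekly pH readings, for illustration only.
weekly_ph = [7.9, 8.0, 8.1, 8.0, 7.8, 7.9, 8.2, 8.1, 8.0]
train, test = chronological_split(weekly_ph)
print(len(train), len(test))
```

Keeping the split chronological, rather than shuffling, means the model is always tested on data later than anything it was trained on.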
Preferably, the water quality parameter is the pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content.
Preferably, the water quality parameter time series prediction model based on RVM regression in step 3 is as follows. For a given input $x_*$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are respectively $y_* = \mu^{T}\phi(x_*)$ and $\sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*)$. The prediction output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, i.e. $t_* \sim N(y_*, \sigma_*^2)$, where $\mu$ is the posterior mean of the weights, $\Sigma$ is the posterior covariance of the weights and $\sigma_{MP}^2$ is the noise variance, so that $(t_* - y_*)/\sigma_* \sim N(0, 1)$. With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from $p\{|(t_* - y_*)/\sigma_*| < z_{\theta/2}\} = 1-\theta$, that is, $p\{y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[y_* - \sigma_* z_{\theta/2},\ y_* + \sigma_* z_{\theta/2}]$. The upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table; the 95% confidence interval is used.
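As a small worked illustration of the interval $[y_* - \sigma_* z_{\theta/2},\ y_* + \sigma_* z_{\theta/2}]$, the sketch below computes the quantile with Python's standard library instead of a printed table. The function name `confidence_interval` and the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(y_star, sigma_star, theta=0.05):
    """Two-sided interval at confidence level 1 - theta for t* ~ N(y*, sigma*^2)."""
    z = NormalDist().inv_cdf(1 - theta / 2)  # upper theta/2 quantile, about 1.96 for 95%
    return y_star - sigma_star * z, y_star + sigma_star * z


# Made-up predictive mean and standard deviation for a pH forecast.
lower, upper = confidence_interval(8.0, 0.1)
print(round(lower, 3), round(upper, 3))
```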
Preferably, when the error is checked in step 3, the mean square error MSE (Mean Square Error) and the correlation coefficient R (Correlation Coefficient) are used as the indices for evaluating the performance of the prediction model. They are computed respectively as:
$MSE = \frac{1}{N}\sum_{i=1}^{N}(y_{ai} - y_{pi})^2$, $R = \frac{\sum_{i=1}^{N}(y_{ai}-\bar{y}_a)(y_{pi}-\bar{y}_p)}{\sqrt{\sum_{i=1}^{N}(y_{ai}-\bar{y}_a)^2\,\sum_{i=1}^{N}(y_{pi}-\bar{y}_p)^2}}$
The smaller the mean square error, the better the predictive performance of the model; the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ are the actual value and the predicted value of the i-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ are the means of the N actual values and the N predicted values of the corresponding water quality parameter.
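The two indices can be computed as in the sketch below, which assumes the usual definitions of mean square error and Pearson correlation coefficient. The function names and the sample numbers are hypothetical.

```python
import math


def mean_square_error(actual, predicted):
    """MSE: average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Made-up values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(mean_square_error(actual, predicted), correlation_coefficient(actual, predicted))
```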
Preferably, the error requirement in step 3 differs between water quality parameters: the mean square error of the model must be less than 0.004 for pH value prediction, less than 0.08 for dissolved oxygen content prediction, less than 0.02 for permanganate index prediction and less than 0.002 for ammonia nitrogen content prediction, and the correlation coefficient of each of these prediction models must be not less than 0.95.
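The acceptance check can be written as a simple table lookup. The thresholds below are the ones stated above; the dictionary keys and the function name `passes_check` are hypothetical names chosen for this sketch.

```python
# MSE limits per parameter and the shared correlation floor, as stated in the text.
MSE_LIMITS = {
    "pH": 0.004,
    "dissolved_oxygen": 0.08,
    "permanganate_index": 0.02,
    "ammonia_nitrogen": 0.002,
}
MIN_CORRELATION = 0.95


def passes_check(parameter, mse_value, r_value):
    """True when MSE is below the parameter's limit and R is not less than 0.95."""
    return mse_value < MSE_LIMITS[parameter] and r_value >= MIN_CORRELATION


print(passes_check("pH", 0.003, 0.96))
```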
Beneficial effect:
By adopting the above technical scheme, the present invention provides a water quality parameter time series prediction method based on relevance vector machine regression that performs time series prediction with a wide prediction range, high accuracy and good stability, and that gives probabilistic output: together with each prediction it also provides a confidence interval, reduces the prediction time, and allows changes in water quality parameters to be observed in a timely manner.
Description of the drawings
Fig. 1 is a schematic flow chart of the present invention;
Fig. 2 shows the time series prediction results for pH value obtained by the prediction model of the present invention with a linear kernel function;
Fig. 3 shows the time series prediction results for dissolved oxygen content obtained by the prediction model of the present invention with a linear kernel function;
Fig. 4 shows the time series prediction results for permanganate index obtained by the prediction model of the present invention with a linear kernel function;
Fig. 5 shows the time series prediction results for ammonia nitrogen content obtained by the prediction model of the present invention with a linear kernel function;
Fig. 6 shows the time series prediction results for pH value obtained by the prediction model of the present invention with a Gaussian kernel function;
Fig. 7 shows the time series prediction results for dissolved oxygen content obtained by the prediction model of the present invention with a Gaussian kernel function;
Fig. 8 shows the time series prediction results for permanganate index obtained by the prediction model of the present invention with a Gaussian kernel function;
Fig. 9 shows the time series prediction results for ammonia nitrogen content obtained by the prediction model of the present invention with a Gaussian kernel function;
Fig. 10 shows the relative error curves for pH value obtained by the prediction model of the present invention and by the support vector machine prediction model, each with a linear kernel function or a Gaussian kernel function;
Fig. 11 shows the relative error curves for dissolved oxygen content obtained by the prediction model of the present invention and by the support vector machine prediction model, each with a linear kernel function or a Gaussian kernel function;
Fig. 12 shows the relative error curves for permanganate index obtained by the prediction model of the present invention and by the support vector machine prediction model, each with a linear kernel function or a Gaussian kernel function;
Fig. 13 shows the relative error curves for ammonia nitrogen content obtained by the prediction model of the present invention and by the support vector machine prediction model, each with a linear kernel function or a Gaussian kernel function.
Specific embodiment
As shown in Fig. 1, the water quality parameter time series prediction method based on relevance vector machine regression comprises the following steps:
Step 1: Collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data so that missing values in the historical data are completed. The missing data are first filled with 0; the historical data are then pre-processed in the time domain according to the time series and frequency-filtered; finally, a best-fit comparison is carried out using the least squares method. On the fitted curve finally obtained, the fitted values at the observation points whose value is 0 are taken as the completion values of the actual missing data, and substituting these completion values completes the missing data in the historical data;
Step 2: Use the first 2/3 of the pre-processed historical water quality parameter data as the training sample set and the last 1/3 as the test sample set;
Step 3: Train the RVM using the water quality parameter values of several preceding consecutive unit times in the training sample set as input and the water quality parameter value of the next unit time as output. Test the trained RVM with the test sample set: feed the water quality parameter values of several preceding consecutive unit times in the test sample set to the input of the trained RVM and observe the predicted value at its output. When the error between the predicted value at the output and the water quality parameter value of the next unit time in the test sample set meets the requirement, the check is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: Use the time series prediction model based on RVM regression to predict new water quality parameter values: feed the water quality parameter values of the most recent several unit times to the input of the prediction model, whose output then predicts the water quality parameter value of the next unit time.
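The missing-data completion in step 1 can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the patent's procedure: the text does not specify the fitted curve, so a least-squares polynomial (NumPy's `polyfit`) is assumed; the time-domain pre-processing and frequency filtering are omitted; and the function name `fill_missing` and its `degree` parameter are hypothetical.

```python
import numpy as np


def fill_missing(values, degree=3):
    """Zero-fill gaps, fit a least-squares curve, replace the zeros by fitted values."""
    # Missing observations (None) are first filled with 0, as in step 1.
    y = np.array([0.0 if v is None else float(v) for v in values])
    t = np.arange(len(y), dtype=float)
    observed = y != 0.0
    # Assumption: a polynomial fitted by least squares to the observed points.
    coeffs = np.polyfit(t[observed], y[observed], degree)
    fitted = np.polyval(coeffs, t)
    # The fitted values at the zero-valued points become the completion values.
    y[~observed] = fitted[~observed]
    return y


# Made-up series with one gap at index 2.
filled = fill_missing([1.0, 2.0, None, 4.0, 5.0, 6.0], degree=1)
print(filled)
```

Treating 0 as the missing marker only works because the monitored quantities are strictly positive; a genuine zero reading would be overwritten.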
The water quality parameter is the pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content.
The water quality parameter time series prediction model based on RVM regression in step 3 is as follows. For a given input $x_*$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are respectively $y_* = \mu^{T}\phi(x_*)$ and $\sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*)$. The prediction output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, i.e. $t_* \sim N(y_*, \sigma_*^2)$, where $\mu$ is the posterior mean of the weights, $\Sigma$ is the posterior covariance of the weights and $\sigma_{MP}^2$ is the noise variance, so that $(t_* - y_*)/\sigma_* \sim N(0, 1)$. With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from $p\{|(t_* - y_*)/\sigma_*| < z_{\theta/2}\} = 1-\theta$, that is, $p\{y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[y_* - \sigma_* z_{\theta/2},\ y_* + \sigma_* z_{\theta/2}]$. The upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table; the 95% confidence interval is used.
When the error is checked in step 3, the mean square error MSE (Mean Square Error) and the correlation coefficient R (Correlation Coefficient) are used as the indices for evaluating the performance of the prediction model. They are computed respectively as:
$MSE = \frac{1}{N}\sum_{i=1}^{N}(y_{ai} - y_{pi})^2$, $R = \frac{\sum_{i=1}^{N}(y_{ai}-\bar{y}_a)(y_{pi}-\bar{y}_p)}{\sqrt{\sum_{i=1}^{N}(y_{ai}-\bar{y}_a)^2\,\sum_{i=1}^{N}(y_{pi}-\bar{y}_p)^2}}$
The smaller the mean square error, the better the predictive performance of the model; the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ are the actual value and the predicted value of the i-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ are the means of the N actual values and the N predicted values of the corresponding water quality parameter.
The error requirement in step 3 differs between water quality parameters: the mean square error of the model must be less than 0.004 for pH value prediction, less than 0.08 for dissolved oxygen content prediction, less than 0.02 for permanganate index prediction and less than 0.002 for ammonia nitrogen content prediction, and the correlation coefficient of each of these prediction models must be not less than 0.95.
The time series prediction model of the water quality parameter is expressed as follows:
Let the time series be $\{y_n\}_{n=1}^{N}$, where N is the length of the sequence and $y_n$ is the monitored water quality parameter value at time n. $x_n = [y_{n-d\tau}, y_{n-(d-1)\tau}, \ldots, y_{n-\tau}]$ is the vector formed by the previous d monitored values, where d is the embedding dimension and $\tau$ is the time delay. A certain mapping relation exists:
$y_n = F(x_n)$, n = 1, 2, ..., N
The key to water quality parameter prediction is an accurate approximation of $F(\cdot)$. For this purpose the training sample set $\{x_n, t_n\}_{n=1}^{N}$ is constructed, where $x_n = [y_{n-d\tau}, y_{n-(d-1)\tau}, \ldots, y_{n-\tau}]^{T}$ is the input sample and $t_n = y_n$ is the output sample. The relevance vector machine is trained on this training sample set to build the water quality parameter time series prediction model, with d = 4 and a recursive delay $\tau$ of 1 week; that is, the data of the previous 4 weeks are used to predict the data of the next week.
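The construction of the pairs $(x_n, t_n)$ can be illustrated with a short sketch. The function name `delay_embed` and the toy series are hypothetical; the defaults d = 4 and tau = 1 follow the text.

```python
def delay_embed(series, d=4, tau=1):
    """Build pairs (x_n, t_n) with x_n = [y_{n-d*tau}, ..., y_{n-tau}] and t_n = y_n."""
    inputs, targets = [], []
    for n in range(d * tau, len(series)):
        # Oldest lag first, most recent lag last, matching the ordering of x_n.
        inputs.append([series[n - k * tau] for k in range(d, 0, -1)])
        targets.append(series[n])
    return inputs, targets


# Toy series standing in for weekly monitoring values.
X, t = delay_embed([1, 2, 3, 4, 5, 6])
print(X, t)
```

A series of length N yields N - d*tau usable pairs, because the first d*tau values have no complete history.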
The water quality parameter time series prediction model based on RVM regression is derived as follows:
Step 1: Given the training sample set $\{x_n, t_n\}_{n=1}^{N}$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), where $x_n \in R^4$ is a 4-dimensional input vector and $t_n \in R$ is the output, assume that the samples are independently distributed and that the relation between input and output can be expressed as $t_n = y(x_n; w) + \varepsilon_n$, where $\varepsilon_n$ is independent and identically distributed Gaussian noise with $\varepsilon_n \sim N(0, \sigma^2)$; that is, $t_n$ follows a Gaussian distribution with mean $y(x_n; w)$ and variance $\sigma^2$;
Step 2: The output of the prediction model can be expressed as $y(x; w) = \sum_{i=1}^{N}\omega_i K(x, x_i) + \omega_0$, where $K(x, x_i)$ is the kernel function, taken as either the linear kernel function or the Gaussian kernel function, and $w = [\omega_0, \omega_1, \ldots, \omega_N]^{T}$ is the model weight vector. Because $\varepsilon_n$ is Gaussian and the target values $t_n$ are mutually independent, the likelihood function of the whole training sample set is $p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2}\exp\{-\|t - \Phi w\|^2/(2\sigma^2)\}$, where $t = [t_1, t_2, \ldots, t_N]^{T}$, $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^{T}$ is an N × (N+1) matrix and $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]^{T}$;
Step 3: So that the model generalises, the Bayesian framework is adopted and a prior probability distribution is introduced: $p(w \mid \alpha) = \prod_{i=0}^{N} N(\omega_i \mid 0, \alpha_i^{-1})$, where $\alpha$ is the hyperparameter vector formed by the N+1 hyperparameters. The posterior probability distribution over the training sample set can then be obtained by Bayes' formula: $p(w \mid t, \alpha, \sigma^2) = p(t \mid w, \sigma^2)\,p(w \mid \alpha)/p(t \mid \alpha, \sigma^2)$. The posterior probability distribution of the weight vector w is $p(w \mid t, \alpha, \sigma^2) = N(\mu, \Sigma)$, whose posterior covariance and mean are respectively $\Sigma = (\sigma^{-2}\Phi^{T}\Phi + A)^{-1}$ and $\mu = \sigma^{-2}\Sigma\Phi^{T}t$, with $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$;
Step 4: Marginalising over the weights for the training sample set gives $p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\,p(w \mid \alpha)\,dw$, which yields the marginal likelihood function of the hyperparameters: $p(t \mid \alpha, \sigma^2) = N(0, C)$, where $C = \sigma^2 I + \Phi A^{-1}\Phi^{T}$. The hyperparameters $\alpha$ and $\sigma^2$ directly affect the posterior probability distribution of w and must be optimised to obtain the maximum a posteriori distribution of w. By introducing a delta function, this is converted into maximising the hyperparameter posterior probability distribution $p(\alpha, \sigma^2 \mid t)$ with respect to $\alpha$ and $\sigma^2$; under a uniform hyperprior, only the marginal likelihood function needs to be maximised;
Step 5: Following MacKay's method, set $\alpha_i^{new} = \gamma_i/\mu_i^2$ and $(\sigma^2)^{new} = \|t - \Phi\mu\|^2/(N - \sum_i \gamma_i)$, where $\mu_i$ is the i-th element of the mean vector $\mu$ and $\gamma_i = 1 - \alpha_i\Sigma_{ii}$ as defined in MacKay's method, $\Sigma_{ii}$ being the i-th diagonal element of the covariance $\Sigma$. $\alpha_i^{new}$ and $(\sigma^2)^{new}$ are updated iteratively until all parameters converge, i.e. the gradient of the output result is less than $10^{-3}$, or until the maximum number of training iterations, 1000, is reached. The hyperparameter $\alpha_{MP}$ and the noise variance $\sigma_{MP}^2$ are thus obtained by the maximum likelihood method;
Step 6: Given an input water quality parameter value $x_*$, the probability distribution of the corresponding output is $p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = \int p(t_* \mid w, \sigma_{MP}^2)\,p(w \mid t, \alpha_{MP}, \sigma_{MP}^2)\,dw$, which is Gaussian, i.e. $p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = N(t_* \mid y_*, \sigma_*^2)$, where the predictive mean and variance are respectively $y_* = \mu^{T}\phi(x_*)$ and $\sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*)$. Since $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, $(t_* - y_*)/\sigma_* \sim N(0, 1)$;
Step 7: With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from $p\{|(t_* - y_*)/\sigma_*| < z_{\theta/2}\} = 1-\theta$, that is, $p\{y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[y_* - \sigma_* z_{\theta/2},\ y_* + \sigma_* z_{\theta/2}]$. The upper quantile $z_{\theta/2}$ can be looked up in the standard normal distribution table; the 95% confidence interval is used.
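The design matrix $\Phi$ of step 2 can be sketched as below, assuming NumPy. The kernel forms are assumptions, since the text names the kernels without giving formulas: the linear kernel is taken as the inner product and the Gaussian kernel as $\exp(-\|a-b\|^2/\mathrm{width}^2)$. The names `linear_kernel`, `gaussian_kernel`, `design_matrix` and `width` are hypothetical.

```python
import numpy as np


def linear_kernel(a, b):
    """Assumed linear kernel: plain inner products between rows of a and b."""
    return a @ b.T


def gaussian_kernel(a, b, width=1.0):
    """Assumed Gaussian kernel exp(-||a - b||^2 / width^2)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / width**2)


def design_matrix(x, centers, kernel):
    """Rows are phi(x_n) = [1, K(x_n, x_1), ..., K(x_n, x_N)]."""
    bias = np.ones((x.shape[0], 1))
    return np.hstack([bias, kernel(x, centers)])


# Two toy 2-dimensional inputs; real inputs would be the 4-dimensional x_n.
x = np.array([[0.0, 0.0], [1.0, 0.0]])
phi = design_matrix(x, x, gaussian_kernel)
print(phi.shape)
```

With N training inputs used as both `x` and `centers`, the result has N rows and N + 1 columns, matching the N × (N+1) matrix in the derivation.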
Based on the derivation of the time series prediction model based on RVM regression, the concrete operating steps of the model are determined as follows:
(1) Determine the training sample set of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content);
(2) Select the kernel function, and determine the kernel width $\gamma$ and the noise variance $\sigma^2$;
(3) Initialise $\alpha$ and $\sigma^2$;
(4) Compute the posterior covariance $\Sigma$ and mean $\mu$ of the weight vector w;
(5) Update $\alpha_i^{new}$ and $(\sigma^2)^{new}$;
(6) Repeat steps (4) and (5) until the maximum number of iterations, 1000, is reached or the gradient of the output result is less than $10^{-3}$;
(7) Delete the weight coefficients and basis functions whose hyperparameters in $\alpha$ are greater than or equal to $\alpha_{max}$ (taken as e9), which makes the model sparse;
(8) For the test set of the water quality parameter, use the hyperparameter $\alpha_{MP}$ and noise variance $\sigma_{MP}^2$ obtained in training to make predictions and estimates.
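Steps (3) to (8) can be sketched with NumPy as below. This is an illustrative reading of the update equations, not the patent's code. Several choices are assumptions: the initial values of alpha and sigma^2, the convergence test (largest change in log alpha below the tolerance, standing in for the unspecified "gradient of the output result"), the reading of e9 as 1e9, and the small numerical floors. The names `train_rvm` and `predict` are hypothetical.

```python
import numpy as np


def train_rvm(phi, t, max_iter=1000, tol=1e-3, alpha_max=1e9):
    """Iterate the MacKay-style updates, then prune weights with alpha >= alpha_max."""
    n, m = phi.shape
    alpha = np.ones(m)                      # step (3): assumed initial alpha
    sigma2 = max(0.1 * np.var(t), 1e-6)     # step (3): assumed initial noise variance

    def posterior(p, a, s2):
        # Step (4): Sigma = (Phi^T Phi / sigma^2 + A)^-1, mu = Sigma Phi^T t / sigma^2
        cov = np.linalg.inv(p.T @ p / s2 + np.diag(a))
        return cov @ p.T @ t / s2, cov

    for _ in range(max_iter):               # step (6)
        mu, cov = posterior(phi, alpha, sigma2)
        gamma = np.clip(1.0 - alpha * np.diag(cov), 1e-12, 1.0)
        # Step (5): alpha_i = gamma_i / mu_i^2, capped at alpha_max for stability.
        alpha_new = np.minimum(gamma / np.maximum(mu**2, 1e-24), alpha_max)
        resid = t - phi @ mu
        sigma2 = max(float(resid @ resid) / max(n - gamma.sum(), 1e-12), 1e-12)
        change = np.max(np.abs(np.log(alpha_new) - np.log(alpha)))
        alpha = alpha_new
        if change < tol:                    # assumed convergence criterion
            break

    keep = np.flatnonzero(alpha < alpha_max)    # step (7): sparsification
    mu, cov = posterior(phi[:, keep], alpha[keep], sigma2)
    return mu, cov, sigma2, keep


def predict(phi_star, mu, cov, sigma2, keep):
    """Step (8): predictive mean y* and variance sigma*^2 for each row of phi_star."""
    p = phi_star[:, keep]
    mean = p @ mu
    var = sigma2 + np.sum((p @ cov) * p, axis=1)
    return mean, var
```

In use, `phi` would be the N × (N+1) design matrix built from the training inputs and `phi_star` the corresponding rows for new inputs; the square root of the returned variance feeds directly into the confidence interval formula.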
Two groups of experimental results are analysed in detail below. For the experimental tests of the present invention, the data come from the weekly automatic water quality monitoring reports for key sections of the major national river basins, published by the Ministry of Environmental Protection of the People's Republic of China (http://www.mep.gov.cn/), for the Longdong section at Panzhihua, Sichuan (week 1 of 2004 to week 53 of 2012). The data of the Panzhihua Longdong section from week 1 of 2004 to week 52 of 2009 are selected as the training data set, and the data from week 1 of 2010 to week 53 of 2012 as the test data set. Since the data of the previous 4 weeks are used to predict the data of the next week, the training data set contains 308 groups and the test data set 153 groups.
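The group counts follow from the window length: a series of N weekly values yields N - d·tau input/output groups. The short check below reproduces 308 and 153 under the assumption, not stated in the text, that each year from 2004 to 2011 contributes 52 weekly reports and 2012 contributes 53. The function name `group_count` is hypothetical.

```python
def group_count(n_weeks, d=4, tau=1):
    """Number of (previous d weeks -> next week) groups in a series of n_weeks values."""
    return n_weeks - d * tau


# Assumed week counts: 52 per year for 2004-2011, 53 for 2012.
train_weeks = 6 * 52          # week 1 of 2004 to week 52 of 2009
test_weeks = 52 + 52 + 53     # week 1 of 2010 to week 53 of 2012
print(group_count(train_weeks), group_count(test_weeks))
```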
Experiment 1: Because the choice of kernel function affects the modelling result to some degree, both the linear kernel function and the Gaussian kernel function are used when modelling with RVM regression in the present invention, so that the most suitable kernel function for each water quality parameter can be selected from the experimental results.
Figs. 2 to 9 show the time series prediction results of the RVM regression modelling algorithm for each water quality parameter with the linear kernel function and with the Gaussian kernel function, giving a qualitative view of the results obtained with the different kernel functions.
As can be seen from Figs. 2 to 5, the time series prediction model using linear kernel RVM regression predicts pH value well; its prediction of dissolved oxygen, permanganate index and ammonia nitrogen is slightly weaker but still acceptable. As can be seen from Figs. 6 to 9, although the predicted values of the time series prediction model using Gaussian kernel RVM regression agree well with the original values for all four water quality parameters, its prediction of dissolved oxygen and permanganate index is clearly better than its prediction of pH value and ammonia nitrogen.
Comparing the prediction results of the Gaussian kernel and linear kernel RVM regression time series prediction models for the four water quality parameters in the above experiment shows that the Gaussian kernel model predicts pH value less well than the linear kernel model, whereas the linear kernel model predicts dissolved oxygen, permanganate index and ammonia nitrogen less well than the Gaussian kernel model.
Because the RVM regression model also yields a confidence interval when it gives a predicted value, the credibility of the prediction result can be assessed, which provides more reference information for water quality monitoring and management bodies. In addition to the predicted and original values of the water quality parameters, Figs. 2 to 9 therefore show the confidence interval at the 95% confidence level, the level most commonly used in statistics. Since actual water quality parameter values are all greater than zero, the 95% confidence interval of a water quality parameter is obtained by removing the part below zero from the original 95% confidence interval. Figs. 2 to 9 show that the RVM regression time series prediction model achieves good prediction results for all four water quality parameters and that the original values of the water quality parameters all fall within the confidence interval. Furthermore, if an actual monitored value (i.e. original value) is far from the predicted value and lies outside the confidence interval, a water pollution accident is considered possible; early warning information can be issued when necessary to prompt the regulator to investigate further the cause of the change in water quality.
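The interval truncation and the early-warning rule described above can be sketched as follows. The function names `clip_interval` and `is_anomalous` and the sample numbers are hypothetical; the rule itself (flag an observation that falls outside the confidence interval after the part below zero has been removed) is taken from the text.

```python
def clip_interval(lower, upper):
    """Remove the part of the confidence interval that lies below zero."""
    return max(lower, 0.0), upper


def is_anomalous(observed, lower, upper):
    """True when the monitored value falls outside the truncated confidence interval."""
    lower, upper = clip_interval(lower, upper)
    return not (lower <= observed <= upper)


# Made-up example: 95% interval [7.8, 8.2] for a pH forecast.
print(is_anomalous(8.0, 7.8, 8.2), is_anomalous(9.0, 7.8, 8.2))
```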
Experiment 2: To illustrate the point further, the RVM regression time series modelling algorithm of the present invention is compared below with the commonly used SVM regression time series modelling algorithm. The comparison covers four aspects: the correlation coefficient R, the mean square error MSE, the prediction time, and the number of relevance vectors nRV or support vectors nSV (the SVM counterpart of the relevance vector is the support vector (Support Vector)).
Tables 1 to 4 compare the time series predictions of RVM regression and SVM regression for each water quality parameter.
Table 1: Comparison of time series prediction results for pH value
Table 2: Comparison of time series prediction results for dissolved oxygen
Table 3: Comparison of time series prediction results for permanganate index
Table 4: Comparison of time series prediction results for ammonia nitrogen
Comparing the time series prediction results of RVM regression and SVM regression in Table 1 shows that, for pH value with the same kernel function, the correlation coefficient of the RVM time series prediction model is significantly greater than that of the SVM time series prediction model, while its mean square error, prediction time and number of relevance vectors (or support vectors) are all significantly lower. Comparing the linear kernel function with the Gaussian kernel function shows that the linear kernel function predicts better. Tables 2 and 3 show that, for dissolved oxygen and permanganate index, the SVM regression time series prediction model is better than RVM in mean square error MSE and the correlation coefficients differ little, but the number of support vectors is tens or even hundreds of times the number of relevance vectors of the RVM regression time series prediction model. Overall, the two time series prediction models differ little in their prediction of dissolved oxygen and permanganate index. Table 4 shows that, with the Gaussian kernel function, the RVM regression time series prediction model is better than SVM regression for ammonia nitrogen in mean square error, correlation coefficient, run time and number of relevance vectors; with the linear kernel function, the RVM regression time series prediction model is better than SVM in every index except mean square error.
Taking the prediction results of Tables 1 to 4 together, RVM generalises as well as SVM, and both time series prediction models obtain good prediction results. The correlation coefficient of the RVM time series prediction model is generally greater than that of SVM, the number of relevance vectors is far smaller than the number of SVM support vectors, and the prediction time is shorter than that of SVM.
To compare the performance of the RVM and SVM time series prediction models more fully, the relative error curves of the time series predictions for the four water quality parameters are plotted in Figs. 10 to 13 (to keep the comparison clear, only the relative error curves of each prediction model for the first 50 groups of the test data set are drawn).
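The text does not define the relative error used for these curves. A minimal sketch, assuming the common definition (predicted - actual) / actual, is given below; the function name `relative_errors` and the sample values are hypothetical.

```python
def relative_errors(actual, predicted):
    """Assumed definition: (predicted - actual) / actual for each sample."""
    return [(p - a) / a for a, p in zip(actual, predicted)]


# Made-up values, for illustration only.
errors = relative_errors([8.0, 4.0], [8.4, 3.8])
print(errors)
```

Dividing by the actual value is safe here because the monitored water quality parameters are strictly positive.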
Figure 10 shows that, for pH prediction, the linear-kernel RVM regression time series prediction model has the smallest relative error, while the Gaussian-kernel RVM and SVM regression time series prediction models have larger relative errors at individual points and do not perform as well as the linear-kernel SVM regression time series prediction model. Figure 11 shows that the Gaussian-kernel RVM regression time series prediction model has the smallest relative error for dissolved oxygen prediction, whereas the Gaussian-kernel SVM regression time series prediction model shows large relative errors at many points. The relative errors of the two RVM kernel models differ little, but overall the RVM time series prediction models are more stable than the SVM time series prediction models and have no large-error points. Figure 12 shows that the RVM and SVM regression time series prediction models differ little in relative error for permanganate index prediction, although neither predicts it as well as pH and dissolved oxygen. Figure 13 clearly shows that the Gaussian-kernel and linear-kernel RVM regression time series prediction models each have a smaller relative error for ammonia nitrogen prediction than the corresponding Gaussian-kernel and linear-kernel SVM regression time series prediction models, and that the Gaussian-kernel SVM time series prediction model shows large relative errors at many points, which seriously degrades its predictions. Taken together, Figures 10 to 13 show that the RVM time series prediction models outperform the SVM time series prediction models: the relative error of RVM is smaller, its performance is more stable, and it can provide probabilistic information about the prediction.
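The relative error curves discussed above can be reproduced along the following lines. This is only a minimal sketch: the text does not give the formula, so the signed definition (predicted minus actual, divided by actual) is an assumption, as are the function name `relative_errors` and the toy values.

```python
import numpy as np

def relative_errors(actual, predicted, first_n=50):
    """Signed relative error for the first `first_n` test samples.

    Assumed definition: (predicted - actual) / actual. The source does not
    state whether the plotted error is signed or absolute.
    """
    actual = np.asarray(actual, dtype=float)[:first_n]
    predicted = np.asarray(predicted, dtype=float)[:first_n]
    return (predicted - actual) / actual

# Hypothetical dissolved oxygen readings and model outputs.
actual = [8.0, 4.0]
predicted = [8.4, 3.8]
errors = relative_errors(actual, predicted)
```

Plotting `errors` against the sample index for each model would give one curve per model, comparable in form to Figures 10 to 13.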
To address the problems of SVM water quality parameter time series prediction models, namely a large number of support vectors, long prediction time and no probabilistic output, this document proposes building the water quality parameter time series prediction model with RVM regression, and uses RVM regression models with a linear kernel function and with a Gaussian kernel function for prediction. The prediction results show that the original values all fall within the 95% confidence interval. Comparison with SVM regression water quality parameter time series prediction models using the corresponding kernel functions shows that the prediction accuracy of the RVM models is, on the whole, not lower than that of the SVM models. Because RVM gives a predictive probability distribution, it can provide a confidence interval along with the prediction, giving water quality monitoring and management agencies more reference information. In addition, the RVM regression model is highly sparse, with the advantages of few relevance vectors, short prediction time and strong generalization ability.
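How a predictive distribution yields a confidence interval can be sketched as follows. The sketch assumes the standard RVM predictive form, in which the mean is the inner product of the posterior weight mean with the basis vector of the new input, and the variance is the noise variance plus a term from the posterior weight covariance. The function name, argument names and numeric values are hypothetical, and the posterior quantities are taken as already estimated by training.

```python
import math
from statistics import NormalDist

import numpy as np

def rvm_predict_interval(phi_star, mu, Sigma, noise_var, confidence=0.95):
    """Return (mean, variance, lower, upper) for one new input.

    phi_star  : basis (kernel) vector of the new input
    mu        : posterior weight mean
    Sigma     : posterior weight covariance
    noise_var : estimated noise variance
    """
    phi_star = np.asarray(phi_star, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    y_star = float(mu @ phi_star)                               # predictive mean
    var_star = float(noise_var + phi_star @ Sigma @ phi_star)   # predictive variance
    theta = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - theta / 2.0)                 # upper theta/2 quantile
    half_width = z * math.sqrt(var_star)
    return y_star, var_star, y_star - half_width, y_star + half_width

# Hypothetical posterior quantities for a two-term model.
mean, var, lower, upper = rvm_predict_interval(
    phi_star=[1.0, 2.0],
    mu=[0.5, 1.0],
    Sigma=0.01 * np.eye(2),
    noise_var=0.04,
)
```

An observed value can then be checked against `lower` and `upper` to see whether it falls inside the 95% interval, which is the comparison the prediction results above report.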
The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiments, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.
Claims (5)
1. A water quality parameter time series prediction method based on relevance vector machine regression, characterized by comprising the following steps:
Step 1: Collect water quality parameter historical data from an automatic water quality monitoring system and preprocess the data to fill in the missing values in the historical data: first fill the missing data with 0, then preprocess the historical data in the time domain according to its time sequence, then apply frequency filtering, and finally perform a best-fit comparison using the least squares method; in the final fitted curve, find the fitted values at the observation points whose value is 0, which are the completion values for the actual missing data, and substitute these completion values to fill in the missing data in the historical data;
Step 2: Use the first 2/3 of the preprocessed water quality parameter historical data as the training sample set and the last 1/3 as the test sample set;
Step 3: Train the RVM using the water quality parameter values of several preceding consecutive unit times in the training sample set as input and the water quality parameter value of the next unit time as output; test the trained RVM with the test sample set by feeding the water quality parameter values of several preceding consecutive unit times from the test sample set into the input of the trained RVM and observing the predicted value at its output; when the error between the predicted value at the output and the next-unit-time water quality parameter value in the test sample set meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: Use the water quality parameter time series prediction model based on RVM regression to predict new water quality parameters, that is, feed the water quality parameter values of several new preceding unit times into the input of the prediction model, and obtain at its output the predicted water quality parameter value of the next unit time.
2. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 1, characterized in that the water quality parameter is pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content.
3. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 2, characterized in that the water quality parameter time series prediction model based on RVM regression in step 3 is as follows: for a given input $x_*$ of the water quality parameter pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content, the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are respectively
$$y_* = \mu^{T}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\,\Sigma\,\phi(x_*).$$
The prediction output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, i.e. $t_* \sim N(y_*, \sigma_*^2)$, where $\mu^{T}$ denotes the posterior weight mean and $\sigma_{MP}^2$ is the noise variance, so the prediction output is $y_* = \mu^{T}\phi(x_*)$. With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from
$$P\left\{\left|\frac{t_* - y_*}{\sigma_*}\right| < z_{\theta/2}\right\} = 1-\theta,$$
that is, $P\{y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$; the upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table, and the 95% confidence interval is computed.
4. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 1, characterized in that, when the error is checked in step 3, the mean square error MSE (Mean Square Error) and the correlation coefficient R (Correlation Coefficient) are used as indexes for evaluating the performance of the prediction model, computed respectively as
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{ai} - y_{pi}\right)^2,$$
$$R = \frac{\sum_{i=1}^{n}\left(y_{ai} - \bar{y}_a\right)\left(y_{pi} - \bar{y}_p\right)}{\sqrt{\sum_{i=1}^{n}\left(y_{ai} - \bar{y}_a\right)^2 \sum_{i=1}^{n}\left(y_{pi} - \bar{y}_p\right)^2}}.$$
A smaller mean square error indicates better predictive performance of the model, and the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction; here $y_{ai}$ and $y_{pi}$ denote the actual value and predicted value of the i-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ denote the means of the corresponding n actual values and n predicted values of the water quality parameter.
5. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 4, characterized in that the error requirement in step 3 differs for different water quality parameters: the mean square error of the pH value prediction model is less than 0.004, that of the dissolved oxygen content prediction model is less than 0.08, that of the permanganate index prediction model is less than 0.02, and that of the ammonia nitrogen content prediction model is less than 0.002, and the correlation coefficients of the above prediction models are all not less than 0.95.
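The data handling and acceptance check recited in the claims can be sketched as follows. This is a minimal illustration rather than the patented implementation: it covers only the chronological 2/3 and 1/3 split, the construction of lagged input windows, and the MSE and correlation check against the limits quoted in claim 5. The function names, the dictionary keys and the window length are hypothetical, and the RVM training itself is not shown.

```python
import numpy as np

# Acceptance limits quoted in claim 5; the keys are hypothetical labels.
MSE_LIMITS = {
    "pH": 0.004,
    "dissolved_oxygen": 0.08,
    "permanganate_index": 0.02,
    "ammonia_nitrogen": 0.002,
}
MIN_CORRELATION = 0.95

def split_train_test(series, train_fraction=2 / 3):
    """Chronological split: earliest part for training, the rest for testing."""
    series = np.asarray(series, dtype=float)
    cut = int(round(series.size * train_fraction))
    return series[:cut], series[cut:]

def make_windows(series, n_lags):
    """Inputs are n_lags consecutive values; the target is the next value."""
    series = np.asarray(series, dtype=float)
    if series.size <= n_lags:
        raise ValueError("series must be longer than n_lags")
    X = np.array([series[i:i + n_lags] for i in range(series.size - n_lags)])
    y = series[n_lags:]
    return X, y

def mean_square_error(actual, predicted):
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

def correlation_coefficient(actual, predicted):
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    a = a - a.mean()
    p = p - p.mean()
    return float(np.sum(a * p) / np.sqrt(np.sum(a ** 2) * np.sum(p ** 2)))

def passes_check(parameter, actual, predicted):
    """True when MSE is under the limit and R is at least 0.95."""
    mse = mean_square_error(actual, predicted)
    r = correlation_coefficient(actual, predicted)
    return bool(mse < MSE_LIMITS[parameter] and r >= MIN_CORRELATION)
```

In use, `make_windows` would be applied separately to the training and test portions returned by `split_train_test`, the regression model would be fitted on the training windows, and `passes_check` would be called with the test targets and the model's predictions.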
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410196457.4A CN103942457B (en) | 2014-05-09 | 2014-05-09 | Water quality parameter time series prediction method based on relevance vector machine regression |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103942457A CN103942457A (en) | 2014-07-23 |
CN103942457B true CN103942457B (en) | 2017-04-12 |
Family
ID=51190125
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410196457.4A Active CN103942457B (en) | 2014-05-09 | 2014-05-09 | Water quality parameter time series prediction method based on relevance vector machine regression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103942457B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106872657A (en) * | 2017-01-05 | 2017-06-20 | 河海大学 | A kind of multivariable water quality parameter time series data accident detection method |
Families Citing this family (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104318325B (en) * | 2014-10-14 | 2017-11-07 | 广东省环境监测中心 | Many basin real-time intelligent water quality prediction methods and system |
CN105676670B (en) * | 2014-11-18 | 2019-07-19 | 北京翼虎能源科技有限公司 | For handling the method and system of multi-energy data |
CN106156260B (en) * | 2015-04-28 | 2020-01-21 | 阿里巴巴集团控股有限公司 | Method and device for repairing missing data |
CN107977724A (en) * | 2016-10-21 | 2018-05-01 | 复凌科技(上海)有限公司 | A kind of water quality hard measurement Forecasting Methodology of permanganate index |
CN107153874B (en) * | 2017-04-11 | 2019-12-20 | 中国农业大学 | Water quality prediction method and system |
CN107392786B (en) * | 2017-07-11 | 2021-04-16 | 中国矿业大学 | Missing data compensation method for mine fiber bragg grating monitoring system based on support vector machine |
CN107480028B (en) * | 2017-07-21 | 2020-09-18 | 东软集团股份有限公司 | Method and device for acquiring usable residual time of disk |
CN107688871B (en) * | 2017-08-18 | 2020-08-21 | 中国农业大学 | Water quality prediction method and device |
CN109241607B (en) * | 2017-09-27 | 2023-05-30 | 山东农业大学 | Proportioning variable fertilization discrete element model parameter calibration method based on correlation vector machine |
CN109669017B (en) * | 2017-10-17 | 2021-04-27 | 中国石油化工股份有限公司 | Refinery distillation tower top cut water ion concentration prediction method based on deep learning |
CN108334977B (en) * | 2017-12-28 | 2020-06-30 | 鲁东大学 | Deep learning-based water quality prediction method and system |
JP7140410B2 (en) * | 2018-03-30 | 2022-09-21 | Necソリューションイノベータ株式会社 | Forecasting system, forecasting method and forecasting program |
CN108764520B (en) * | 2018-04-11 | 2021-11-16 | 杭州电子科技大学 | Water quality parameter prediction method based on multilayer cyclic neural network and D-S evidence theory |
CN108595892A (en) * | 2018-05-11 | 2018-09-28 | 南京林业大学 | Soft-measuring modeling method based on time difference model |
US10521701B2 (en) * | 2018-05-18 | 2019-12-31 | Google Llc | Parallel decoding using autoregressive machine learning models |
CN108710974B (en) * | 2018-05-18 | 2020-09-11 | 中国农业大学 | Water ammonia nitrogen prediction method and device based on deep belief network |
CN108846423A (en) * | 2018-05-29 | 2018-11-20 | 中国农业大学 | Water quality prediction method and system |
CN109165247B (en) * | 2018-09-30 | 2021-07-23 | 中冶华天工程技术有限公司 | Intelligent pretreatment method for sewage measurement data |
CN109784528A (en) * | 2018-12-05 | 2019-05-21 | 鲁东大学 | Water quality prediction method and device based on time series and support vector regression |
CN110182871A (en) * | 2019-07-10 | 2019-08-30 | 银天远创(厦门)科技有限公司 | A kind of method for treating water and terminal based on full-automatic medicine system |
CN110245881A (en) * | 2019-07-16 | 2019-09-17 | 重庆邮电大学 | A kind of water quality prediction method and system of the sewage treatment based on machine learning |
CN112182830B (en) * | 2019-08-06 | 2022-10-18 | 长春工业大学 | Water quality parameter prediction method |
JP6999137B2 (en) * | 2019-08-20 | 2022-01-18 | 株式会社カサイ | Water quality management equipment and methods for aquaculture ponds |
CN110889085A (en) * | 2019-09-30 | 2020-03-17 | 华南师范大学 | Intelligent wastewater monitoring method and system based on complex network multiple online regression |
CN110838344B (en) * | 2019-11-08 | 2023-04-07 | 北京理工大学 | Water quality data analysis method |
CN111080502B (en) * | 2019-12-17 | 2023-09-08 | 清华苏州环境创新研究院 | Big data identification method for regional enterprise data abnormal behaviors |
CN112036082B (en) * | 2020-08-27 | 2022-03-08 | 东北大学秦皇岛分校 | Time series data prediction method based on attention mechanism |
CN112489402A (en) * | 2020-11-27 | 2021-03-12 | 罗普特科技集团股份有限公司 | Early warning method, device and system for pipe gallery and storage medium |
CN113281478A (en) * | 2021-04-20 | 2021-08-20 | 广州珠水生态环境技术有限公司 | Water quality acid-base nature of water resource environmental protection restores and uses monitoring system |
CN113449789B (en) * | 2021-06-24 | 2024-05-03 | 北京市生态环境监测中心 | Quality control method for monitoring water quality by full spectrum water quality monitoring equipment based on big data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102968573A (en) * | 2012-12-14 | 2013-03-13 | 哈尔滨工业大学 | Online lithium ion battery residual life predicting method based on relevance vector regression |
CN103020642A (en) * | 2012-10-08 | 2013-04-03 | 江苏省环境监测中心 | Water environment monitoring and quality-control data analysis method |
CN103235096A (en) * | 2013-04-16 | 2013-08-07 | 广州铁路职业技术学院 | Sewage water quality detection method and apparatus |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103942457B (en) | Water quality parameter time series prediction method based on relevance vector machine regression | |
CN103136539B (en) | Ground net corrosion speed grade Forecasting Methodology | |
Cui et al. | Deep learning-based time-varying parameter identification for system-wide load modeling | |
CN105389980B (en) | Short-time Traffic Flow Forecasting Methods based on long short-term memory recurrent neural network | |
Shamshad et al. | First and second order Markov chain models for synthetic generation of wind speed time series | |
CN109934337A (en) | A kind of detection method of the spacecraft telemetry exception based on integrated LSTM | |
CN108009674A (en) | Air PM2.5 concentration prediction methods based on CNN and LSTM fused neural networks | |
CN110472846A (en) | Nuclear power plant's thermal-hydraulic safety analysis the best-estimated adds uncertain method | |
CN103942461A (en) | Water quality parameter prediction method based on online sequential extreme learning machine | |
CN102542169B (en) | Linear selecting method in computing process of hydrological frequency | |
CN112001110B (en) | Structural damage identification monitoring method based on vibration signal space real-time recurrent graph convolutional neural network | |
CN110501646A (en) | Off-line lithium battery residual capacity estimation method | |
CN113762486B (en) | Method and device for constructing fault diagnosis model of converter valve and computer equipment | |
CN108921279A (en) | Reservoir day enters water prediction technique | |
Zhang et al. | An anomaly identification model for wind turbine state parameters | |
CN106203723A (en) | Wind power short-term interval prediction method based on RT reconstruct EEMD RVM built-up pattern | |
CN103268279B (en) | Based on the software reliability prediction method of compound poisson process | |
CN111767517A (en) | BiGRU multi-step prediction method and system applied to flood prediction and storage medium | |
CN105930670A (en) | Model parameter uncertainty-based dynamic prediction method for river emergency pollution accident | |
CN106295121A (en) | Landscape impoundments Bayes's water quality grade Forecasting Methodology | |
CN103018063A (en) | Bridge random fatigue life prediction method based on Mittag-Leffler distribution | |
CN109034225A (en) | A kind of combination stochastic variable ash and the modified uncertain parameters estimation method of Bayesian model | |
CN112307536A (en) | Dam seepage parameter inversion method | |
Katipoğlu | Monthly stream flows estimation in the Karasu river of Euphrates basin with artificial neural networks approach | |
CN107194507A (en) | A kind of short-term wind speed forecasting method of wind farm based on combination SVMs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||