CN103942457B - Water quality parameter time series prediction method based on relevance vector machine regression - Google Patents

Water quality parameter time series prediction method based on relevance vector machine regression

Info

Publication number
CN103942457B
Authority
CN
China
Prior art keywords
water quality
quality parameter
time series
value
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410196457.4A
Other languages
Chinese (zh)
Other versions
CN103942457A (en)
Inventor
汪晓东
笪英云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Normal University CJNU
Original Assignee
Zhejiang Normal University CJNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Normal University CJNU filed Critical Zhejiang Normal University CJNU
Priority to CN201410196457.4A priority Critical patent/CN103942457B/en
Publication of CN103942457A publication Critical patent/CN103942457A/en
Application granted granted Critical
Publication of CN103942457B publication Critical patent/CN103942457B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a water quality parameter time series prediction method based on relevance vector machine (RVM) regression. The method comprises the following steps: (1) acquiring historical water quality parameter data from an automatic water quality monitoring station and pre-processing the data; (2) using the first 2/3 of the pre-processed historical data as a training sample set and the last 1/3 as a test sample set; (3) training an RVM with the training sample set and testing the trained RVM with the test sample set to obtain a water quality parameter time series prediction model based on RVM regression; (4) using this model to predict new water quality parameters. The method performs time series prediction with a wide prediction range, high accuracy and good prediction stability. It also provides probabilistic output, giving a confidence interval together with each prediction, reduces the prediction time, and allows changes in water quality parameters to be observed in a timely manner.

Description

Water quality parameter time series prediction method based on relevance vector machine regression
Technical Field
The present invention relates to the field of water quality monitoring, and in particular to a water quality parameter time series prediction method based on relevance vector machine regression.
Background Art
A water quality parameter time series is an ordered sequence of monitoring data that reflects how a given water quality parameter is distributed over time, for example the monitored pH value of a river basin section from week 1 to week 50 of a given year. A water quality parameter time series prediction method uses the acquired historical time series to analyze the inherent statistical characteristics and development patterns of the historical data, builds a water quality parameter time series prediction model, and uses the model to obtain predicted data that show the development trend of future data. Water quality parameter time series prediction is basic work for water environment management and pollution control. At present, because early-stage information and technical support are lacking, water pollution accidents in China are mostly tallied after the fact, so changes in water quality cannot be predicted and pollution accidents cannot be avoided. Establishing a reliable water quality parameter time series prediction model has therefore become one of the research hotspots of water environment science in recent years.
The water quality parameter time series prediction models currently common in China and abroad are mainly artificial neural network models and support vector machine (Support Vector Machine, SVM) regression models. Artificial neural network algorithms, however, are prone to over-learning or under-learning, local minima, difficulty in determining the network structure, and poor generalization. The SVM regression model is a supervised learning method built on statistical learning theory and the structural risk minimization principle. It maps the original input through a kernel function into a high-dimensional feature space in which the problem becomes linearly separable. It generalizes well, is not prone to overfitting, and handles small samples, nonlinearity, high dimensionality and local minima comparatively well. Compared with artificial neural network time series prediction models, SVM-based time series prediction models perform better. In an SVM time series prediction model, however, the kernel function must satisfy Mercer's condition, the number of support vectors grows linearly with the number of training samples, and only a deterministic prediction is given. There is no probabilistic output, so the uncertainty of the prediction cannot be estimated, whereas in practical applications a probabilistic prediction provides important information that helps determine the confidence of a water quality parameter prediction.
The relevance vector machine (Relevance Vector Machine, RVM) is a relatively new machine learning algorithm proposed by Tipping in 2001 within a Bayesian framework. Its kernel function need not satisfy Mercer's condition, its solution is far sparser than that of the SVM, it provides probabilistic information about the prediction, and it generalizes well. The RVM has been applied to many practical problems such as pattern recognition and regression estimation, with good results.
The Chinese patent with Application No. 20131013190.7 provides a sewage water quality monitoring method and device. The prediction model used by that method is a soft-sensing method based on the relevance vector machine. Compared with models built with neural networks or SVM modeling, it has better applicability and higher prediction accuracy, but it has the following defects. First, because it analyzes related parameters and then derives the effluent total nitrogen or effluent total phosphorus content, the uncertainty of the related data strongly affects the final output; although the output is much improved over models built with neural networks or SVM modeling, the final output is still unstable. Second, it merely analyzes the effluent total nitrogen or effluent total phosphorus content at the current time and cannot perform time series prediction, so its prediction range is small and its accuracy is low.
Summary of the Invention
The technical problem to be solved by the invention is to provide a water quality parameter time series prediction method based on relevance vector machine regression that performs time series prediction with a wide prediction range, high accuracy and good prediction stability, provides probabilistic output so that a confidence interval is given together with each prediction, reduces the prediction time, and allows changes in water quality parameters to be observed in a timely manner.
To solve the above technical problem, the invention adopts the following scheme: a water quality parameter time series prediction method based on relevance vector machine regression, comprising the following steps:
Step 1: Collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data to fill in the missing data. First pad the missing data with 0, then pre-process the historical data in the time domain according to the time series, then apply frequency filtering, and finally perform a best-fit comparison using the least squares method. In the fitted curve finally obtained, find the fitted values at the observation points whose value is 0. These are the fill-in values for the actual missing data, and substituting them completes the missing data in the historical data;
Step 2: Use the first 2/3 of the pre-processed historical water quality parameter data as the training sample set and the last 1/3 as the test sample set;
Step 3: Train the RVM using the water quality parameter values of several preceding consecutive unit times in the training sample set as input and the water quality parameter value of the next unit time as output. Then test the trained RVM with the test sample set: feed the water quality parameter values of several preceding consecutive unit times in the test sample set into the input of the trained RVM and observe the predicted value at its output. When the error between the predicted value and the next-unit-time water quality parameter value in the test sample set meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: Use the time series prediction model based on RVM regression to predict new water quality parameters, that is, feed the water quality parameter values of the most recent several unit times into the input of the prediction model, which then outputs the predicted water quality parameter value of the next unit time.
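The gap filling in Step 1 can be illustrated with a short sketch. It is not the patented procedure: the time-domain pre-processing and frequency filtering are omitted, a low-order polynomial stands in for the fitted curve, and the name `fill_missing_least_squares` and the `degree` parameter are hypothetical. Missing values are marked as NaN here rather than padded with 0, so that the placeholders do not bias the fit.

```python
import numpy as np

def fill_missing_least_squares(values, degree=3):
    """Fill NaN gaps with values read off a least-squares polynomial fit.

    Assumption: a polynomial of the given degree approximates the fitted
    curve; filtering steps described in the text are not modelled here.
    """
    y = np.asarray(values, dtype=float)
    t = np.arange(len(y), dtype=float)
    missing = np.isnan(y)
    # Fit on the observed points only.
    coeffs = np.polyfit(t[~missing], y[~missing], deg=degree)
    filled = y.copy()
    # Replace each gap with the fitted value at that time index.
    filled[missing] = np.polyval(coeffs, t[missing])
    return filled
```

For a weekly series, `values` would hold one entry per week with NaN wherever the monitoring station reported nothing.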
Preferably, the water quality parameter is the pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content.
Preferably, the water quality parameter time series prediction model based on RVM regression in Step 3 is as follows. For a given input $x_*$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are $y_* = \mu^{T}\phi(x_*)$ and $\sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*)$. The predicted output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, i.e. $t_* \sim N(y_*, \sigma_*^2)$, where $\mu$ is the posterior mean of the weights and $\sigma_{MP}^2$ is the noise variance, so the prediction is output as the mean $y_*$ together with its variance $\sigma_*^2$. With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ follows from $P\{\,|t_* - y_*|/\sigma_* < z_{\theta/2}\,\} = 1-\theta$, i.e. $P\{\,y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\,\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$. The upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table, and the 95% confidence interval is used.
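As a small illustration of the two-sided confidence interval described above, the following sketch computes the interval from a predictive mean and standard deviation. The function name `confidence_interval` and the example numbers are hypothetical; the quantile comes from Python's standard library instead of a printed normal table.

```python
from statistics import NormalDist

def confidence_interval(mean, std, confidence=0.95):
    """Two-sided interval [mean - std*z, mean + std*z] for a Gaussian prediction."""
    theta = 1.0 - confidence
    # Upper theta/2 quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - theta / 2.0)
    return mean - std * z, mean + std * z

# Hypothetical prediction: pH mean 7.8 with standard deviation 0.1.
low, high = confidence_interval(7.8, 0.1)
```

For a 95% confidence level the quantile is approximately 1.96, so the interval is roughly the mean plus or minus 1.96 standard deviations.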
Preferably, when checking the error in Step 3, the mean square error MSE (Mean Square Error) and the correlation coefficient R (Correlation Coefficient) are used as the indices for evaluating the performance of the prediction model. They are computed as:
$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_{ai} - y_{pi})^2$
$R = \dfrac{\sum_{i=1}^{n}(y_{ai} - \bar{y}_a)(y_{pi} - \bar{y}_p)}{\sqrt{\sum_{i=1}^{n}(y_{ai} - \bar{y}_a)^2 \sum_{i=1}^{n}(y_{pi} - \bar{y}_p)^2}}$
A smaller mean square error indicates better prediction performance of the model, and the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ are the actual value and the predicted value of the i-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ are the mean of the n actual values and the mean of the n predicted values of the corresponding water quality parameter.
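The two evaluation indices can be computed directly from paired actual and predicted values. The sketch below is a minimal NumPy version of the standard definitions of mean square error and the Pearson correlation coefficient; the function names are illustrative.

```python
import numpy as np

def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean((a - p) ** 2))

def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    da, dp = a - a.mean(), p - p.mean()
    return float(np.sum(da * dp) / np.sqrt(np.sum(da ** 2) * np.sum(dp ** 2)))
```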
Preferably, the error requirement in Step 3 differs by water quality parameter: the mean square error of the model must be less than 0.004 for pH value prediction, less than 0.08 for dissolved oxygen content prediction, less than 0.02 for permanganate index prediction, and less than 0.002 for ammonia nitrogen content prediction, and the correlation coefficient of each of these prediction models must be not less than 0.95.
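The acceptance criteria above amount to a per-parameter lookup. A minimal sketch follows; the dictionary keys and the function name `passes_check` are hypothetical labels chosen for illustration, while the numeric limits are those stated in the text.

```python
# MSE limits per water quality parameter, as stated in the text.
MSE_LIMITS = {
    "pH": 0.004,
    "dissolved_oxygen": 0.08,
    "permanganate_index": 0.02,
    "ammonia_nitrogen": 0.002,
}
MIN_CORRELATION = 0.95

def passes_check(parameter, mse_value, r_value):
    """Return True when the model meets both the MSE and the correlation requirement."""
    return mse_value < MSE_LIMITS[parameter] and r_value >= MIN_CORRELATION
```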
Beneficial effects:
By adopting the above technical scheme, the invention provides a water quality parameter time series prediction method based on relevance vector machine regression that performs time series prediction with a wide prediction range, high accuracy and good prediction stability. It provides probabilistic output, so a confidence interval is given together with each prediction, the prediction time is reduced, and changes in water quality parameters can be observed in a timely manner.
Brief Description of the Drawings
Fig. 1 is a schematic flow chart of the invention;
Fig. 2 shows the time series prediction result for pH value obtained by the prediction model of the invention with a linear kernel function;
Fig. 3 shows the time series prediction result for dissolved oxygen content obtained by the prediction model of the invention with a linear kernel function;
Fig. 4 shows the time series prediction result for permanganate index obtained by the prediction model of the invention with a linear kernel function;
Fig. 5 shows the time series prediction result for ammonia nitrogen content obtained by the prediction model of the invention with a linear kernel function;
Fig. 6 shows the time series prediction result for pH value obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 7 shows the time series prediction result for dissolved oxygen content obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 8 shows the time series prediction result for permanganate index obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 9 shows the time series prediction result for ammonia nitrogen content obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 10 shows the relative error curves for pH value obtained by the prediction model of the invention with a linear or Gaussian kernel function and by the support vector machine prediction model with a linear or Gaussian kernel function;
Fig. 11 shows the relative error curves for dissolved oxygen content obtained by the prediction model of the invention with a linear or Gaussian kernel function and by the support vector machine prediction model with a linear or Gaussian kernel function;
Fig. 12 shows the relative error curves for permanganate index obtained by the prediction model of the invention with a linear or Gaussian kernel function and by the support vector machine prediction model with a linear or Gaussian kernel function;
Fig. 13 shows the relative error curves for ammonia nitrogen content obtained by the prediction model of the invention with a linear or Gaussian kernel function and by the support vector machine prediction model with a linear or Gaussian kernel function.
Detailed Description of the Embodiments
As shown in Fig. 1, the water quality parameter time series prediction method based on relevance vector machine regression comprises the following steps:
Step 1: Collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data to fill in the missing data. First pad the missing data with 0, then pre-process the historical data in the time domain according to the time series, then apply frequency filtering, and finally perform a best-fit comparison using the least squares method. In the fitted curve finally obtained, find the fitted values at the observation points whose value is 0. These are the fill-in values for the actual missing data, and substituting them completes the missing data in the historical data;
Step 2: Use the first 2/3 of the pre-processed historical water quality parameter data as the training sample set and the last 1/3 as the test sample set;
Step 3: Train the RVM using the water quality parameter values of several preceding consecutive unit times in the training sample set as input and the water quality parameter value of the next unit time as output. Then test the trained RVM with the test sample set: feed the water quality parameter values of several preceding consecutive unit times in the test sample set into the input of the trained RVM and observe the predicted value at its output. When the error between the predicted value and the next-unit-time water quality parameter value in the test sample set meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: Use the time series prediction model based on RVM regression to predict new water quality parameters, that is, feed the water quality parameter values of the most recent several unit times into the input of the prediction model, which then outputs the predicted water quality parameter value of the next unit time.
The water quality parameter is the pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content. The water quality parameter time series prediction model based on RVM regression in Step 3 is as follows. For a given input $x_*$ of the water quality parameter, the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are $y_* = \mu^{T}\phi(x_*)$ and $\sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*)$. The predicted output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, i.e. $t_* \sim N(y_*, \sigma_*^2)$, where $\mu$ is the posterior mean of the weights and $\sigma_{MP}^2$ is the noise variance, so the prediction is output as the mean $y_*$ together with its variance $\sigma_*^2$. With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ follows from $P\{\,|t_* - y_*|/\sigma_* < z_{\theta/2}\,\} = 1-\theta$, i.e. $P\{\,y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\,\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$. The upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table, and the 95% confidence interval is used.
When checking the error in Step 3, the mean square error MSE (Mean Square Error) and the correlation coefficient R (Correlation Coefficient) are used as the indices for evaluating the performance of the prediction model. They are computed as:
$MSE = \frac{1}{n}\sum_{i=1}^{n}(y_{ai} - y_{pi})^2$
$R = \dfrac{\sum_{i=1}^{n}(y_{ai} - \bar{y}_a)(y_{pi} - \bar{y}_p)}{\sqrt{\sum_{i=1}^{n}(y_{ai} - \bar{y}_a)^2 \sum_{i=1}^{n}(y_{pi} - \bar{y}_p)^2}}$
A smaller mean square error indicates better prediction performance of the model, and the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ are the actual value and the predicted value of the i-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ are the mean of the n actual values and the mean of the n predicted values of the corresponding water quality parameter. The error requirement in Step 3 differs by water quality parameter: the mean square error of the model must be less than 0.004 for pH value prediction, less than 0.08 for dissolved oxygen content prediction, less than 0.02 for permanganate index prediction, and less than 0.002 for ammonia nitrogen content prediction, and the correlation coefficient of each of these prediction models must be not less than 0.95.
The time series prediction model of the water quality parameter is expressed as follows:
Let the time series be $\{y_n\}_{n=1}^{N}$, where N is the length of the sequence and $y_n$ is the monitored water quality parameter value at time n. Let $x_n = [y_{n-d\tau}, y_{n-(d-1)\tau}, \ldots, y_{n-\tau}]$ be the vector formed by the d preceding monitored values, where d is the embedding dimension and $\tau$ is the time delay. Then there exists a mapping:
$y_n = F(x_n),\quad n = 1, 2, \ldots, N$
The key to water quality parameter prediction is an accurate approximation of $F(\cdot)$. To this end, a training sample set $\{x_n, t_n\}_{n=1}^{N}$ is constructed, where $x_n = [y_{n-d\tau}, y_{n-(d-1)\tau}, \ldots, y_{n-\tau}]^{T}$ is the input sample and $t_n = y_n$ is the output sample. The relevance vector machine is trained with this training sample set to build the water quality parameter time series prediction model. Here d = 4 and the delay $\tau$ is 1 week, i.e. the data of the preceding 4 weeks are used to predict the data of the next week.
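The construction of input vectors from preceding values can be sketched as a sliding window over the series. The function name `embed` is hypothetical; the defaults d = 4 and tau = 1 follow the text.

```python
import numpy as np

def embed(series, d=4, tau=1):
    """Build inputs x_n = [y_{n-d*tau}, ..., y_{n-tau}] and targets t_n = y_n."""
    y = np.asarray(series, dtype=float)
    start = d * tau  # first index that has d delayed predecessors
    X = np.array([[y[n - k * tau] for k in range(d, 0, -1)]
                  for n in range(start, len(y))])
    t = y[start:]
    return X, t
```

With d = 4 and a weekly delay, a series of 312 weekly values yields 308 input/target pairs, because the first 4 values have no complete window of predecessors.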
The water quality parameter time series prediction model based on RVM regression is derived as follows:
Step 1: Given the training sample set $\{x_n, t_n\}_{n=1}^{N}$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), where $x_n \in R^{4}$ is a 4-dimensional input vector and $t_n$ is the output, assume the samples are independently distributed and that the relation between input and output can be expressed as $t_n = y(x_n; w) + \varepsilon_n$, where $\varepsilon_n$ is independent identically distributed Gaussian noise with $\varepsilon_n \sim N(0, \sigma^2)$, i.e. $t_n$ follows a Gaussian distribution with mean $y(x_n; w)$ and variance $\sigma^2$;
Step 2: The output of the prediction model can be expressed as $y(x; w) = \sum_{i=1}^{N}\omega_i K(x, x_i) + \omega_0$, where $K(x, x_i)$ is the kernel function, chosen as either a linear kernel or a Gaussian kernel, and $w = [\omega_0, \omega_1, \ldots, \omega_N]^{T}$ is the weight vector of the model. Because $\varepsilon_n$ is Gaussian and the target values $t_n$ are mutually independent, the likelihood function of the whole training sample set is $p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2}\exp\{-\|t - \Phi w\|^2 / (2\sigma^2)\}$, where $t = [t_1, t_2, \ldots, t_N]^{T}$, $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^{T}$ is an $N \times (N+1)$ matrix, and $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]^{T}$;
Step 3: To give the model generalization ability, a Bayesian framework is used and a prior probability distribution is introduced: $p(w \mid \alpha) = \prod_{i=0}^{N} N(\omega_i \mid 0, \alpha_i^{-1})$, where $\alpha$ is the hyperparameter vector composed of N+1 hyperparameters. The posterior probability distribution can then be obtained by Bayesian inference: $p(w \mid t, \alpha, \sigma^2) = p(t \mid w, \sigma^2)\,p(w \mid \alpha) / p(t \mid \alpha, \sigma^2)$. The posterior probability distribution of the weight vector $w$ is $p(w \mid t, \alpha, \sigma^2) = N(w \mid \mu, \Sigma)$, with posterior variance and mean $\Sigma = (\sigma^{-2}\Phi^{T}\Phi + A)^{-1}$ and $\mu = \sigma^{-2}\Sigma\Phi^{T}t$, where $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$;
Step 4: Marginalizing over the weights gives $p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\,p(w \mid \alpha)\,dw$, which yields the marginal likelihood function of the hyperparameters: $p(t \mid \alpha, \sigma^2) = N(0, C)$, where $C = \sigma^2 I + \Phi A^{-1}\Phi^{T}$. The hyperparameters $\alpha$ and $\sigma^2$ directly affect the posterior probability distribution of $w$ and must be optimized to obtain the maximum a posteriori distribution of $w$. By introducing a delta function, the problem is converted into maximizing the hyperparameter posterior probability distribution $p(\alpha, \sigma^2 \mid t)$ with respect to $\alpha$ and $\sigma^2$. With uniform hyperpriors, only the marginal likelihood function needs to be maximized;
Step 5: Following MacKay's method, set $\alpha_i^{new} = \gamma_i / \mu_i^2$ and $(\sigma^2)^{new} = \|t - \Phi\mu\|^2 / (N - \sum_i \gamma_i)$, where $\mu_i$ is the i-th element of the mean vector $\mu$, MacKay's method defines $\gamma_i = 1 - \alpha_i\Sigma_{ii}$, and $\Sigma_{ii}$ is the i-th element on the diagonal of the variance $\Sigma$. Update $\alpha^{new}$ and $(\sigma^2)^{new}$ iteratively until all parameters converge, i.e. until the gradient of the output result is less than $10^{-3}$ or the maximum number of training iterations, 1000, is reached. The maximum likelihood method thus yields the hyperparameters $\alpha_{MP}$ and the noise variance $\sigma_{MP}^2$;
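Step 6: For a given input water quality parameter value $x_*$, the probability distribution of the corresponding output is $p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = \int p(t_* \mid w, \sigma_{MP}^2)\,p(w \mid t, \alpha_{MP}, \sigma_{MP}^2)\,dw$, which is Gaussian, i.e. $p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = N(t_* \mid y_*, \sigma_*^2)$, where the predictive mean and variance are $y_* = \mu^{T}\phi(x_*)$ and $\sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\Sigma\,\phi(x_*)$. Thus $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, so $t_* \sim N(y_*, \sigma_*^2)$;
Step 7: With the confidence level set to $1-\theta$, the two-sided confidence interval of $t_*$ follows from $P\{\,|t_* - y_*|/\sigma_* < z_{\theta/2}\,\} = 1-\theta$, i.e. $P\{\,y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\,\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$. The upper quantile $z_{\theta/2}$ can be looked up in the standard normal distribution table, and the 95% confidence interval is used.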
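For fixed hyperparameters, the posterior over the weights and the predictive distribution in the derivation above reduce to a few matrix operations. The sketch below is one way to write them in NumPy. The function names are hypothetical, and the Gaussian kernel form exp(-||x - x'||^2 / width^2) is an assumption, since the text names the kernel width but does not state the kernel formula.

```python
import numpy as np

def design_matrix(X, centers, kernel="linear", width=1.0):
    """Rows phi(x) = [1, K(x, c_1), ..., K(x, c_N)] for each row x of X."""
    X, centers = np.atleast_2d(X), np.atleast_2d(centers)
    if kernel == "linear":
        K = X @ centers.T
    else:
        # Assumed Gaussian kernel form; the text does not specify it.
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        K = np.exp(-sq / width ** 2)
    return np.hstack([np.ones((X.shape[0], 1)), K])

def posterior(Phi, t, alpha, sigma2):
    """Posterior covariance Sigma and mean mu of the weights."""
    Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(alpha))
    mu = Sigma @ Phi.T @ t / sigma2
    return mu, Sigma

def predict(x_new, centers, mu, Sigma, sigma2, kernel="linear", width=1.0):
    """Predictive mean y* and variance sigma*^2 for each row of x_new."""
    phi = design_matrix(x_new, centers, kernel=kernel, width=width)
    mean = phi @ mu
    var = sigma2 + np.einsum("ij,jk,ik->i", phi, Sigma, phi)
    return mean, var
```

Here `centers` are the training inputs, so the design matrix for N training samples has N + 1 columns including the bias column.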
Based on the derivation of the time series prediction model based on RVM regression, the concrete operating steps of the model are as follows:
(1) determine the training sample set of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content);
(2) select the kernel function, and determine the kernel width $\gamma$ and the noise variance $\sigma^2$;
(3) initialize $\alpha$ and $\sigma^2$;
(4) compute the posterior variance $\Sigma$ and mean $\mu$ of the weight vector $w$;
(5) update $\alpha^{new}$ and $(\sigma^2)^{new}$;
(6) repeat steps (4) and (5) until the maximum number of iterations, 1000, is reached or the gradient of the output result is less than $10^{-3}$;
(7) delete the weight coefficients and basis functions corresponding to the hyperparameters $\alpha$ that are greater than or equal to $\alpha_{max}$ (taken as e9), making the model sparse;
(8) for the test set of the water quality parameter, make the prediction estimate using the hyperparameters $\alpha_{MP}$ and the noise variance $\sigma_{MP}^2$ obtained from training.
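Operating steps (3) to (7) can be sketched as a short iterative loop. This is an illustrative version under stated assumptions, not the patented implementation: the function name `train_rvm`, the initial values, the numerical clipping constants, and the convergence test (largest change in log alpha below the tolerance, used here in place of the gradient criterion in the text) are all assumptions, and `alpha_max` is read as 1e9.

```python
import numpy as np

def train_rvm(Phi, t, max_iter=1000, tol=1e-3, alpha_max=1e9):
    """Re-estimate alpha and sigma^2 with MacKay-style updates, then prune."""
    n_samples, n_basis = Phi.shape
    alpha = np.ones(n_basis)                    # assumed initial hyperparameters
    sigma2 = max(0.1 * float(np.var(t)), 1e-6)  # assumed initial noise variance
    for _ in range(max_iter):
        # Step (4): posterior covariance and mean of the weights.
        Sigma = np.linalg.inv(Phi.T @ Phi / sigma2 + np.diag(alpha))
        mu = Sigma @ Phi.T @ t / sigma2
        # Step (5): gamma_i = 1 - alpha_i * Sigma_ii, then update alpha and sigma^2.
        gamma = np.clip(1.0 - alpha * np.diag(Sigma), 1e-12, 1.0)
        new_alpha = np.minimum(gamma / np.maximum(mu ** 2, 1e-300), 1e12)
        residual = t - Phi @ mu
        sigma2 = max(float(residual @ residual)
                     / max(n_samples - gamma.sum(), 1e-12), 1e-12)
        # Step (6): assumed stopping rule on the change in log(alpha).
        converged = np.max(np.abs(np.log(new_alpha) - np.log(alpha))) < tol
        alpha = new_alpha
        if converged:
            break
    # Step (7): drop basis functions whose alpha reached alpha_max.
    keep = alpha < alpha_max
    P = Phi[:, keep]
    Sigma = np.linalg.inv(P.T @ P / sigma2 + np.diag(alpha[keep]))
    mu = Sigma @ P.T @ t / sigma2
    return keep, mu, Sigma, sigma2
```

The returned boolean mask `keep` marks the retained columns of the design matrix, and `mu` and `Sigma` are the posterior mean and covariance over those retained weights only.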
Two groups of experimental results are analyzed in detail below. For the experimental tests of the invention, the data come from the weekly automatic water quality monitoring reports (week 1 of 2004 to week 53 of 2012) for the Longdong section in Panzhihua, Sichuan, a key section of the major national river basins, published by the Ministry of Environmental Protection of the People's Republic of China (http://www.mep.gov.cn/). The data of the Longdong section from week 1 of 2004 to week 52 of 2009 are used as the training data set, and the data from week 1 of 2010 to week 53 of 2012 are used as the test data set. Because the data of the preceding 4 weeks are used to predict the data of the next week, the training data set contains 308 groups and the test data set contains 153 groups.
Experiment 1: Because the choice of kernel function affects the modeling result to some degree, a linear kernel function and a Gaussian kernel function were each used when modeling with RVM regression in the invention, so that the most suitable kernel function for each water quality parameter could be selected from the experimental results.
Figs. 2 to 9 show the time series prediction results for each water quality parameter obtained by the RVM regression modeling algorithm with the linear kernel function and with the Gaussian kernel function, giving a qualitative view of the results for the different kernel functions.
As can be seen from Figs. 2 to 5, the time series prediction model based on linear-kernel RVM regression predicts pH value best. Its predictions for dissolved oxygen, permanganate index and ammonia nitrogen are slightly weaker but still acceptable. As can be seen from Figs. 6 to 9, the predicted values of the time series prediction model based on Gaussian-kernel RVM regression agree well with the original values for all four water quality parameters, but its predictions for dissolved oxygen and permanganate index are clearly better than those for pH value and ammonia nitrogen.
Comparing the predictions of the Gaussian-kernel and linear-kernel RVM regression models for the four water quality parameters in the above experiment shows that the Gaussian-kernel RVM regression model does not predict pH value as well as the linear-kernel RVM regression model, whereas the linear-kernel RVM regression model does not predict dissolved oxygen, permanganate index and ammonia nitrogen as well as the Gaussian-kernel RVM regression model.
Because the RVM regression model gives a confidence interval together with each predicted value, the credibility of the prediction can be assessed, which provides more reference information for water quality monitoring and management agencies. In addition to the predicted and original values of the water quality parameters, Figs. 2 to 9 therefore also show the confidence interval at the 95% confidence level, the level most commonly used in statistics. Because actual water quality parameter values are all greater than zero, the part below zero is removed from the original 95% confidence interval. Figs. 2 to 9 show that the RVM regression time series prediction model achieves good predictions for all four water quality parameters and that the original values all fall within the confidence interval. Furthermore, if an actual monitored value (i.e. an original value) lies far from the predicted value and outside the confidence interval, a water pollution accident may have occurred. A warning can then be issued if necessary, prompting the regulator to investigate the cause of the change in water quality.
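The warning logic described above can be sketched as a simple check of an observation against the truncated confidence interval. The function name `check_observation` and the example values are hypothetical; the lower bound is clipped at zero as the text describes.

```python
from statistics import NormalDist

def check_observation(observed, mean, std, confidence=0.95):
    """Return (alert, (lower, upper)); alert is True when observed lies outside."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lower = max(mean - z * std, 0.0)  # water quality values cannot be negative
    upper = mean + z * std
    alert = not (lower <= observed <= upper)
    return alert, (lower, upper)

# Hypothetical ammonia nitrogen prediction: mean 0.1, standard deviation 0.1.
alert, interval = check_observation(0.5, 0.1, 0.1)
```

An alert only indicates that the observation is statistically unusual under the model, so it would prompt further inspection instead of confirming a pollution event.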
Experiment 2: To examine the method further, the RVM regression time series modeling algorithm of the invention is compared below with the common SVM regression time series modeling algorithm. The comparison covers four aspects: the correlation coefficient R, the mean square error MSE, the prediction time, and the number of relevance vectors nRV or support vectors nSV (the support vector (Support Vector) is the SVM counterpart of the relevance vector).
Tables 1 to 4 compare the time series predictions of RVM regression and SVM regression for each water quality parameter.
Table 1. Comparison of time series prediction results for pH value
Table 2. Comparison of time series prediction results for dissolved oxygen
Table 3. Comparison of time series prediction results for permanganate index
Table 4. Comparison of time series prediction results for ammonia nitrogen
Comparing the RVM and SVM regression results in Table 1 shows that, for pH value and with the same kernel function, the correlation coefficient of the RVM time series prediction model is significantly larger than that of the SVM time series prediction model, while its mean square error, prediction time and number of relevance vectors (or support vectors) are all considerably smaller. Comparing the linear kernel function with the Gaussian kernel function shows that the linear kernel function predicts better than the Gaussian kernel function. Tables 2 and 3 show that, for dissolved oxygen and permanganate index, the SVM regression time series prediction model is better than the RVM in mean square error MSE and the correlation coefficients differ little, but the number of support vectors is tens or even hundreds of times the number of relevance vectors of the RVM regression time series prediction model. Overall, the two time series prediction models differ little in their predictions of dissolved oxygen and permanganate index. Table 4 shows that with the Gaussian kernel function the RVM regression time series prediction model is better than SVM regression in mean square error, correlation coefficient, running time and number of relevance vectors for ammonia nitrogen prediction, and that with the linear kernel function it is better than the SVM in every index except mean square error.
Taken together, the results in Tables 1 to 4 show that the RVM, like the SVM, generalizes well, and both time series prediction models achieve good predictions. The correlation coefficient of the RVM time series prediction model is generally larger than that of the SVM, its number of relevance vectors is far smaller than the number of SVM support vectors, and its prediction time is shorter than that of the SVM.
To compare the performance of the RVM and SVM time series prediction models more fully, relative error curves were plotted for the time series prediction of the four water quality parameters, as shown in Figures 10 to 13. To keep the comparison plots legible, only the relative error curves of each prediction model for the first 50 groups of data in the test data set are drawn.
As can be seen from Figure 10, for pH value prediction the linear-kernel RVM regression time series prediction model has the smallest relative error; the Gaussian-kernel RVM and SVM regression time series prediction models have larger relative errors at individual points and do not perform as well as the linear-kernel SVM regression time series prediction model. Figure 11 shows that the Gaussian-kernel RVM regression time series prediction model has the smallest relative error for dissolved oxygen prediction, whereas the Gaussian-kernel SVM regression time series prediction model shows points of large relative error in many places. The relative errors of the two RVM kernel variants differ little, but on the whole the RVM time series prediction models are more stable than the SVM time series prediction models and have no points of large error. Figure 12 shows that the RVM and SVM regression time series prediction models differ little in relative error when predicting permanganate index, although neither predicts it as well as pH value and dissolved oxygen. Figure 13 shows clearly that the relative errors of the Gaussian-kernel and linear-kernel RVM regression time series prediction models for ammonia nitrogen are smaller than those of the Gaussian-kernel and linear-kernel SVM regression time series prediction models respectively, and that the Gaussian-kernel SVM time series prediction model shows points of large relative error in many places, which seriously degrades its prediction. Taking Figures 10 to 13 together, the RVM time series prediction model is better than the SVM time series prediction model: its relative error is smaller, its performance is more stable, and it can provide probabilistic information about the prediction.
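By way of illustration only, the relative error series behind curves such as those of Figures 10 to 13 could be computed as sketched below. The text does not state the formula used, so the definition |predicted − actual| / |actual| is an assumption, as are the function name `relative_errors` and the sample readings, which are not data from this document.

```python
def relative_errors(actual, predicted, limit=50):
    """Relative error |pred - act| / |act| for the first `limit` test points.

    Assumes every actual value is non-zero (true for pH and for
    concentrations above the detection limit).
    """
    pairs = list(zip(actual, predicted))[:limit]
    return [abs(p - a) / abs(a) for a, p in pairs]


# Illustrative pH readings only, not values from the patent's tables.
actual = [7.2, 7.4, 7.1, 7.3]
predicted = [7.1, 7.4, 7.3, 7.2]
errors = relative_errors(actual, predicted)
```

Plotting `errors` against the sample index for each model would give one curve per model, in the manner of the comparison described above.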
To address the problems of SVM water quality parameter time series prediction models, namely a large number of support vectors, long prediction time and the lack of probabilistic output, a method of building a water quality parameter time series prediction model using RVM regression is proposed herein. RVM regression models with a linear kernel function and with a Gaussian kernel function were each used for prediction, and the prediction results show that the original values all fall within the 95% confidence interval. Comparison with SVM regression water quality parameter time series prediction models using the corresponding kernel functions shows that the prediction accuracy of the RVM models is, on the whole, not lower than that of the SVM models. Because the RVM gives a probability distribution for the prediction, it can provide a confidence interval along with the predicted value, and thus supplies more reference information to water quality monitoring and management agencies. In addition, the RVM regression model is highly sparse and has the advantages of few relevance vectors, short prediction time and strong generalization ability.
The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the invention belongs may make various modifications or supplements to the described specific embodiments, or substitute them in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (5)

1. A water quality parameter time series prediction method based on relevance vector machine regression, characterized by comprising the following steps:
Step 1: Collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data to fill in the missing data in the historical data: first set the missing data to 0, then pre-process the historical data in the time domain according to the time series, then apply frequency filtering, and finally perform a best-fit comparison using the least squares method; in the final fitted curve, find the fitted values corresponding to the observation points whose value is 0, these being the fill-in values for the actual missing data, and substitute the fill-in values so as to complete the missing data in the historical data;
Step 2: Use the first 2/3 of the pre-processed historical water quality parameter data as the training sample set and the last 1/3 as the test sample set;
Step 3: Train the RVM using the water quality parameter values of several preceding consecutive unit time intervals in the training sample set as input and the water quality parameter value of the next unit time interval as output; test the trained RVM with the test sample set by feeding the water quality parameter values of several preceding consecutive unit time intervals in the test sample set into the input of the trained RVM and observing the predicted value at the output of the RVM; when the error between the predicted value at the output and the water quality parameter value of the next unit time interval in the test sample set meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: Predict new water quality parameters using the water quality parameter time series prediction model based on RVM regression, that is, feed the water quality parameter values of the several most recent unit time intervals into the input of the prediction model, and obtain the predicted water quality parameter value of the next unit time interval at its output.
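By way of illustration only, the chronological 2/3 and 1/3 split of Step 2 and the construction of input and output samples in Step 3 can be sketched as follows. The function names, the window length of 3 and the toy series are assumptions made for the example; the claim does not fix the number of preceding unit time intervals, and the RVM training itself is not shown.

```python
def chronological_split(series):
    """Split a time-ordered series: first 2/3 for training, last 1/3 for testing."""
    cut = (2 * len(series)) // 3  # integer arithmetic avoids float rounding
    return series[:cut], series[cut:]


def make_windows(series, window):
    """Build (input, target) pairs: `window` consecutive values -> next value."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets


# Toy series standing in for pre-processed historical readings.
series = [float(v) for v in range(12)]
train, test = chronological_split(series)
X_train, y_train = make_windows(train, window=3)
X_test, y_test = make_windows(test, window=3)
```

Each row of `X_train` would be presented to the regressor as input with the matching entry of `y_train` as the desired output; `X_test` and `y_test` play the same role during the check in Step 3.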
2. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 1, characterized in that the water quality parameter is pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content.
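3. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 2, characterized in that the water quality parameter time series prediction model based on RVM regression in Step 3 is as follows: for a given input $x_*$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are respectively $y_* = \mu^T \phi(x_*)$ and $\sigma_*^2 = \sigma^2 + \phi(x_*)^T \Sigma\, \phi(x_*)$; the prediction output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, i.e. $t_* \sim N(y_*, \sigma_*^2)$, where $\mu^T$ denotes the posterior weight mean, $\sigma^2$ is the noise variance, $\phi(x_*)$ is the vector of kernel (basis) functions evaluated at $x_*$ and $\Sigma$ is the posterior weight covariance; the prediction output is then taken as the predictive mean $y_*$. Setting the confidence level to $1-\theta$, the two-sided confidence interval of $t_*$ can be obtained from $P\{\,|t_* - y_*|/\sigma_* < z_{\theta/2}\,\} = 1-\theta$, that is, $P\{\,y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2}\,\} = 1-\theta$, which gives the confidence interval of $t_*$ at confidence level $1-\theta$ as $[\,y_* - \sigma_* z_{\theta/2},\ y_* + \sigma_* z_{\theta/2}\,]$; the upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table, and the 95% confidence interval is computed.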
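By way of illustration only, the two-sided confidence interval of claim 3 can be computed from a predictive mean and standard deviation as sketched below. The function name and the sample numbers are assumptions; the upper quantile is obtained from Python's standard library `statistics.NormalDist` rather than from a printed table, and the predictive mean and variance are taken as given rather than produced by a trained RVM.

```python
from statistics import NormalDist


def confidence_interval(mean, std, confidence=0.95):
    """Two-sided interval [mean - z*std, mean + z*std] for a Gaussian prediction."""
    theta = 1.0 - confidence
    # Upper theta/2 quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - theta / 2.0)
    return mean - z * std, mean + z * std


# Illustrative predictive mean and standard deviation for a pH forecast.
low, high = confidence_interval(mean=7.30, std=0.05)
```

For a 95% interval the quantile is approximately 1.96, so the interval spans roughly 1.96 standard deviations on either side of the predictive mean.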
4. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 1, characterized in that in Step 3 the error check uses the mean squared error (MSE) and the correlation coefficient $R$ as the indices for evaluating the performance of the prediction model, computed respectively as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{ai} - y_{pi}\right)^2$$

$$R = \frac{\sum_{i=1}^{n}\left(y_{ai} - \bar{y}_a\right)\left(y_{pi} - \bar{y}_p\right)}{\sqrt{\sum_{i=1}^{n}\left(y_{ai} - \bar{y}_a\right)^2 \cdot \sum_{i=1}^{n}\left(y_{pi} - \bar{y}_p\right)^2}}$$

The smaller the mean squared error, the better the prediction performance of the model; the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ denote the actual value and the predicted value of the $i$-th sample of the water quality parameter respectively, and $\bar{y}_a$ and $\bar{y}_p$ denote the mean of the $n$ actual values and the mean of the $n$ predicted values of the corresponding water quality parameter respectively.
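By way of illustration only, the two evaluation indices of claim 4 can be computed as sketched below. The function names and sample values are assumptions, and the correlation coefficient is written in the standard Pearson form, with the square root of the product of the two sums of squares in the denominator.

```python
import math


def mean_squared_error(actual, predicted):
    """MSE = (1/n) * sum((y_a - y_p)^2)."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    ss_a = sum((a - mean_a) ** 2 for a in actual)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_a * ss_p)


# Illustrative values only: predictions exactly twice the actual values.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
mse = mean_squared_error(actual, predicted)
r = correlation_coefficient(actual, predicted)
```

The example is chosen to show that the two indices measure different things: the predictions are perfectly correlated with the actual values, yet the mean squared error is large, which is why the claim uses both.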
5. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 4, characterized in that in Step 3 the error requirement differs for different water quality parameters: the mean squared error of the model predicting pH value is less than 0.004, the mean squared error of the model predicting dissolved oxygen content is less than 0.08, the mean squared error of the model predicting permanganate index is less than 0.02, and the mean squared error of the model predicting ammonia nitrogen content is less than 0.002; and the correlation coefficient of each of the above prediction models is not less than 0.95.
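By way of illustration only, the acceptance test of claim 5 can be expressed as a simple lookup against the stated limits. The dictionary keys and the function name are assumptions made for the example; the numerical limits are those recited in the claim.

```python
# MSE upper limits per water quality parameter, as recited in claim 5.
MSE_LIMITS = {
    "pH": 0.004,
    "dissolved_oxygen": 0.08,
    "permanganate_index": 0.02,
    "ammonia_nitrogen": 0.002,
}
MIN_CORRELATION = 0.95  # correlation coefficient must be not less than this


def passes_check(parameter, mse, r):
    """Return True when the model meets both the MSE and correlation limits."""
    return mse < MSE_LIMITS[parameter] and r >= MIN_CORRELATION


ok = passes_check("pH", mse=0.003, r=0.97)
```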

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410196457.4A CN103942457B (en) 2014-05-09 2014-05-09 Water quality parameter time series prediction method based on relevance vector machine regression

Publications (2)

Publication Number Publication Date
CN103942457A CN103942457A (en) 2014-07-23
CN103942457B CN103942457B (en) 2017-04-12

Family

ID=51190125

Country Status (1)

Country Link
CN (1) CN103942457B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968573A (en) * 2012-12-14 2013-03-13 哈尔滨工业大学 Online lithium ion battery residual life predicting method based on relevance vector regression
CN103020642A (en) * 2012-10-08 2013-04-03 江苏省环境监测中心 Water environment monitoring and quality-control data analysis method
CN103235096A (en) * 2013-04-16 2013-08-07 广州铁路职业技术学院 Sewage water quality detection method and apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant