CN103942457A - Water quality parameter time series prediction method based on relevance vector machine regression - Google Patents
- Publication number: CN103942457A
- Application number: CN201410196457.4A
- Authority: CN (China)
- Prior art keywords: water quality, quality parameter, time series, prediction, value
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention provides a water quality parameter time series prediction method based on relevance vector machine (RVM) regression. The method comprises the following steps: (1) acquiring historical water quality parameter data from an automatic water quality monitoring station and pre-processing the data; (2) using the first 2/3 of the pre-processed historical data as the training sample set and the last 1/3 as the test sample set; (3) training an RVM on the training sample set and testing the trained RVM on the test sample set to obtain a water quality parameter time series prediction model based on RVM regression; (4) using this model to predict new water quality parameters. The method performs time series prediction with a large prediction range, high accuracy and good stability. It provides probabilistic output, giving a confidence interval together with each prediction, shortens prediction time, and allows changes in water quality parameters to be observed in time.
Description
Technical field
The present invention relates to the field of water quality monitoring, and in particular to a water quality parameter time series prediction method based on relevance vector machine regression.
Background art
A water quality parameter time series is an ordered sequence of monitoring data that reflects how a given water quality parameter is distributed over time, for example the pH values monitored at a river basin section from week 1 to week 50 of a given year. Water quality parameter time series prediction takes an acquired set of historical time series, analyses the intrinsic statistical properties and development trends of the historical data, builds a time series prediction model, and uses the model to obtain predicted data that indicate how future data will develop. It is basic work for water environment management and pollution control. At present, because early information and technical support are lacking, water pollution accidents in China are mostly tallied after the fact: changes in water quality cannot be predicted and pollution accidents cannot be averted. Building a reliable water quality parameter time series prediction model has therefore become a research focus in water environment science in recent years.
The water quality parameter time series prediction models in common use, in China and abroad, are mainly artificial neural networks and support vector machine (SVM) regression. Artificial neural networks are prone to over-learning or under-learning, local minima, network structures that are hard to determine, and poor generalization. SVM regression is a supervised learning method built on statistical learning theory and structural risk minimization. It maps the original input through a kernel function into a high-dimensional feature space in which the problem becomes linearly separable, generalizes well, is not prone to over-fitting, and copes well with small samples, non-linearity, high dimensionality and local minima. SVM-based time series prediction models therefore perform better than neural network models. In an SVM model, however, the kernel function must satisfy Mercer's condition, the number of support vectors grows linearly with the number of training samples, and only a deterministic prediction is given. There is no probabilistic output, so the uncertainty of a prediction cannot be estimated, even though probabilistic predictions provide important information in practice and help determine how much confidence to place in a water quality parameter prediction.
The relevance vector machine (RVM) is a more recent machine learning algorithm, proposed by Tipping in 2001 within a Bayesian framework. Its kernel function need not satisfy Mercer's condition, its solutions are far sparser than those of SVM, it provides probabilistic information about each prediction, and it generalizes well. RVM has been applied to many practical problems in pattern recognition and regression estimation, with good results.
Chinese patent application No. 20131013190.7 provides a sewage quality monitoring method and device whose prediction model is a soft-sensing method based on the relevance vector machine. Compared with models built with neural networks and SVM, it has better applicability and higher prediction accuracy, but it has the following defects. First, because it obtains the effluent total nitrogen or effluent total phosphorus content by analysing correlated parameters, uncertainty in the related data strongly affects the final output; although the output is much improved over neural network and SVM models, the final result remains unstable. Second, it only analyses the effluent total nitrogen or total phosphorus content at the current moment, so it cannot perform time series prediction, its prediction range is small, and its accuracy is low.
Summary of the invention
The technical problem to be solved by the invention is to provide a water quality parameter time series prediction method based on relevance vector machine regression that performs time series prediction with a large prediction range, high accuracy and good stability, provides probabilistic output by giving a confidence interval together with each prediction, shortens prediction time, and allows changes in water quality parameters to be observed in time.
To solve this problem, the invention adopts the following scheme: a water quality parameter time series prediction method based on relevance vector machine regression, comprising the following steps:
Step 1: collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data so as to fill in the missing values. The missing data are first filled with 0; the historical data are then pre-processed in the time domain as a time series and frequency-filtered; finally a least-squares best fit is performed. On the resulting fitted curve, the fitted values at the observation points whose value is 0 are taken as the completion values of the actual missing data and are substituted into the historical data;
Step 2: use the first 2/3 of the pre-processed historical data as the training sample set and the last 1/3 as the test sample set;
Step 3: train the RVM using the water quality parameter values of several consecutive preceding unit time intervals in the training sample set as input and the value of the next unit time interval as output. Then test the trained RVM with the test sample set: feed the values of several consecutive preceding unit time intervals from the test sample set into the trained RVM and observe the predicted value at its output. If the error between the predicted value and the test set's value for the next unit time interval meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: use the RVM-regression-based time series prediction model to predict new water quality parameters: feed the water quality parameter values of the most recent several unit time intervals into the model, and the model outputs the predicted value for the next unit time interval.
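The gap filling in Step 1 can be illustrated with a short Python sketch. It is a simplified illustration rather than the patented pipeline: the time-domain pre-processing and frequency filtering, which the text does not specify, are left out, and the least-squares curve is assumed to be a straight line fitted to the observed points nearest each gap. The function name `fill_missing`, the `window` parameter, and the use of `None` (in place of the 0 placeholder) to mark a gap are all assumptions made for this example.

```python
def fill_missing(series, window=4):
    """Fill None entries with the value of a local least-squares line.

    Simplified sketch: no filtering step; the `window` observed points
    closest in time to each gap are used for the fit (an assumption).
    """
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series contains no observed values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # Observed points nearest to the gap position i
        near = sorted(observed, key=lambda p: abs(p[0] - i))[:window]
        n = len(near)
        mean_x = sum(p[0] for p in near) / n
        mean_y = sum(p[1] for p in near) / n
        sxx = sum((p[0] - mean_x) ** 2 for p in near)
        sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in near)
        slope = sxy / sxx if sxx else 0.0
        # Fitted value at the gap becomes the completion value
        filled[i] = mean_y + slope * (i - mean_x)
    return filled


weekly_ph = [7.1, 7.2, None, 7.4, 7.5]
print(fill_missing(weekly_ph))
```

A local fit is used here only to keep the example short; a fit over the filtered full series, as the step describes, would replace the `near` selection.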
Preferably, the water quality parameter is pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content.
Preferably, the water quality parameter time series prediction model based on RVM regression in step 3 is as follows. For a given input $x_*$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are respectively
$$y_* = \mu^{T}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\,\Sigma\,\phi(x_*)$$
The predicted output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, where $\mu$ is the posterior mean of the weights, $\Sigma$ is their posterior covariance and $\sigma_{MP}^2$ is the noise variance; the predicted output is $y_* = y(x_*; \mu)$.
Setting the confidence level to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from
$$P\left\{ \left| \frac{t_* - y_*}{\sigma_*} \right| < z_{\theta/2} \right\} = 1-\theta$$
that is,
$$P\{\, y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2} \,\} = 1-\theta$$
so the confidence interval of $t_*$ at confidence level $1-\theta$ is $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$. The upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table, and the 95% confidence interval is used.
Preferably, when the error is checked in step 3, the mean square error (MSE) and the correlation coefficient R are used as indices of prediction model performance. They are computed as
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_{ai} - y_{pi})^2, \qquad R = \frac{\sum_{i=1}^{n} (y_{ai} - \bar{y}_a)(y_{pi} - \bar{y}_p)}{\sqrt{\sum_{i=1}^{n} (y_{ai} - \bar{y}_a)^2 \, \sum_{i=1}^{n} (y_{pi} - \bar{y}_p)^2}}$$
The smaller the mean square error, the better the predictive performance of the model; the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ are the actual value and the predicted value of the $i$-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ are the means of the $n$ actual values and of the $n$ predicted values of that parameter.
Preferably, the error requirement in step 3 differs by water quality parameter: the mean square error of the model must be below 0.004 for pH value, below 0.08 for dissolved oxygen content, below 0.02 for permanganate index, and below 0.002 for ammonia nitrogen content, and the correlation coefficient of each of these models must be no less than 0.95.
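The acceptance check described above can be sketched in Python as follows. The sketch assumes the standard definitions of mean square error and the Pearson correlation coefficient; the function names and dictionary keys are illustrative, while the numeric limits are the ones quoted in the text.

```python
import math


def mse(actual, predicted):
    """Mean square error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def correlation(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# MSE ceilings quoted in the text; the keys are illustrative names.
MSE_LIMIT = {
    "pH": 0.004,
    "dissolved_oxygen": 0.08,
    "permanganate_index": 0.02,
    "ammonia_nitrogen": 0.002,
}


def passes(parameter, actual, predicted, min_r=0.95):
    """True if the model meets both the MSE limit and the R requirement."""
    return (mse(actual, predicted) < MSE_LIMIT[parameter]
            and correlation(actual, predicted) >= min_r)


actual = [7.0, 7.2, 7.4, 7.6]
predicted = [7.02, 7.18, 7.41, 7.59]
print(mse(actual, predicted), passes("pH", actual, predicted))
```

Note that `correlation` is undefined when either series is constant, since the denominator is then zero.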
Beneficial effect:
With the above technical scheme, the invention provides a water quality parameter time series prediction method based on relevance vector machine regression that performs time series prediction with a large prediction range, high accuracy and good stability. It provides probabilistic output, giving a confidence interval together with each prediction, shortens prediction time, and allows changes in water quality parameters to be observed in time.
Brief description of the drawings
Fig. 1 is a flow chart of the invention;
Fig. 2 shows the time series prediction results for pH value obtained by the prediction model of the invention with a linear kernel function;
Fig. 3 shows the time series prediction results for dissolved oxygen content obtained by the prediction model of the invention with a linear kernel function;
Fig. 4 shows the time series prediction results for permanganate index obtained by the prediction model of the invention with a linear kernel function;
Fig. 5 shows the time series prediction results for ammonia nitrogen content obtained by the prediction model of the invention with a linear kernel function;
Fig. 6 shows the time series prediction results for pH value obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 7 shows the time series prediction results for dissolved oxygen content obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 8 shows the time series prediction results for permanganate index obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 9 shows the time series prediction results for ammonia nitrogen content obtained by the prediction model of the invention with a Gaussian kernel function;
Fig. 10 shows the relative error curves for pH value obtained with the prediction model of the invention and with the support vector machine, each using a linear or a Gaussian kernel function;
Fig. 11 shows the relative error curves for dissolved oxygen content obtained with the prediction model of the invention and with the support vector machine, each using a linear or a Gaussian kernel function;
Fig. 12 shows the relative error curves for permanganate index obtained with the prediction model of the invention and with the support vector machine, each using a linear or a Gaussian kernel function;
Fig. 13 shows the relative error curves for ammonia nitrogen content obtained with the prediction model of the invention and with the support vector machine, each using a linear or a Gaussian kernel function.
Embodiment
As shown in Fig. 1, the water quality parameter time series prediction method based on relevance vector machine regression comprises the following steps:
Step 1: collect historical water quality parameter data from an automatic water quality monitoring system and pre-process the data so as to fill in the missing values. The missing data are first filled with 0; the historical data are then pre-processed in the time domain as a time series and frequency-filtered; finally a least-squares best fit is performed. On the resulting fitted curve, the fitted values at the observation points whose value is 0 are taken as the completion values of the actual missing data and are substituted into the historical data;
Step 2: use the first 2/3 of the pre-processed historical data as the training sample set and the last 1/3 as the test sample set;
Step 3: train the RVM using the water quality parameter values of several consecutive preceding unit time intervals in the training sample set as input and the value of the next unit time interval as output. Then test the trained RVM with the test sample set: feed the values of several consecutive preceding unit time intervals from the test sample set into the trained RVM and observe the predicted value at its output. If the error between the predicted value and the test set's value for the next unit time interval meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: use the RVM-regression-based time series prediction model to predict new water quality parameters: feed the water quality parameter values of the most recent several unit time intervals into the model, and the model outputs the predicted value for the next unit time interval.
The water quality parameter is pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content. The water quality parameter time series prediction model based on RVM regression in step 3 is as follows. For a given input $x_*$ of the water quality parameter, the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are respectively
$$y_* = \mu^{T}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\,\Sigma\,\phi(x_*)$$
The predicted output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$, where $\mu$ is the posterior mean of the weights, $\Sigma$ is their posterior covariance and $\sigma_{MP}^2$ is the noise variance; the predicted output is $y_* = y(x_*; \mu)$.
Setting the confidence level to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from
$$P\left\{ \left| \frac{t_* - y_*}{\sigma_*} \right| < z_{\theta/2} \right\} = 1-\theta$$
that is,
$$P\{\, y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2} \,\} = 1-\theta$$
so the confidence interval of $t_*$ at confidence level $1-\theta$ is $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$. The upper quantile $z_{\theta/2}$ is looked up in the standard normal distribution table, and the 95% confidence interval is used.
When the error is checked in step 3, the mean square error (MSE) and the correlation coefficient R are used as indices of prediction model performance. They are computed as
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_{ai} - y_{pi})^2, \qquad R = \frac{\sum_{i=1}^{n} (y_{ai} - \bar{y}_a)(y_{pi} - \bar{y}_p)}{\sqrt{\sum_{i=1}^{n} (y_{ai} - \bar{y}_a)^2 \, \sum_{i=1}^{n} (y_{pi} - \bar{y}_p)^2}}$$
The smaller the mean square error, the better the predictive performance of the model; the closer the absolute value of the correlation coefficient is to 1, the more accurate the prediction. Here $y_{ai}$ and $y_{pi}$ are the actual value and the predicted value of the $i$-th sample of the water quality parameter, and $\bar{y}_a$ and $\bar{y}_p$ are the means of the $n$ actual values and of the $n$ predicted values of that parameter.
The error requirement in step 3 differs by water quality parameter: the mean square error of the model must be below 0.004 for pH value, below 0.08 for dissolved oxygen content, below 0.02 for permanganate index, and below 0.002 for ammonia nitrogen content, and the correlation coefficient of each of these models must be no less than 0.95.
The time series prediction model of the water quality parameter is expressed as follows.
Let the time series be $\{y_n\}_{n=1}^{N}$, where $N$ is the sequence length and $y_n$ is the monitored value of the water quality parameter at time $n$. The vector $x_n = [y_{n-d\tau}, y_{n-(d-1)\tau}, \ldots, y_{n-\tau}]$ consists of the $d$ preceding monitored values, where $d$ is the embedding dimension and $\tau$ is the time delay. A mapping then exists such that
$$y_n = F(x_n), \quad n = 1, 2, \ldots, N$$
The key to water quality parameter prediction is to approximate $F(\cdot)$ accurately. To this end a training sample set $\{x_n, t_n\}_{n=1}^{N}$ is built, where $x_n = [y_{n-d\tau}, y_{n-(d-1)\tau}, \ldots, y_{n-\tau}]^{T}$ is the input sample and $t_n = y_n$ is the output sample. The relevance vector machine is trained on this sample set to build the water quality parameter time series prediction model, with $d = 4$ and a delay $\tau$ of one week, so that the data of the previous 4 weeks are used to predict the data of the next week.
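The construction of the training pairs $(x_n, t_n)$ and the chronological 2/3 to 1/3 split can be sketched in Python as below. The function names are hypothetical, and rounding the split point to the nearest integer is an assumption, since the text does not say how a fractional boundary is handled.

```python
def make_samples(series, d=4, tau=1):
    """Build lagged samples: x_n = [y[n-d*tau], ..., y[n-tau]], t_n = y[n]."""
    inputs, targets = [], []
    for n in range(d * tau, len(series)):
        inputs.append([series[n - k * tau] for k in range(d, 0, -1)])
        targets.append(series[n])
    return inputs, targets


def chronological_split(series, train_fraction=2 / 3):
    """Split without shuffling: earliest part for training, rest for testing."""
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


weekly = [7.1, 7.2, 7.0, 7.3, 7.4, 7.2, 7.5, 7.6, 7.4]
train, test = chronological_split(weekly)
x_train, t_train = make_samples(train)   # d = 4 weeks, tau = 1 week
print(x_train[0], t_train[0])            # first four weeks, then the fifth
```

Keeping the split chronological means the model is never tested on weeks that precede its training data.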
The water quality parameter time series prediction model based on RVM regression is derived in the following steps:
Step 1: given the training sample set $\{x_n, t_n\}_{n=1}^{N}$ of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content), where $x_n \in \mathbb{R}^{4}$ are 4-dimensional input vectors and $t_n$ are the outputs, assume that the samples are independently distributed and that input and output are related by $t_n = y(x_n; w) + \varepsilon_n$, where $\varepsilon_n$ is independent and identically distributed Gaussian noise, $\varepsilon_n \sim N(0, \sigma^2)$. That is, $t_n$ follows a Gaussian distribution with mean $y(x_n; w)$ and variance $\sigma^2$;
Step 2: the output of the prediction model can be expressed as
$$y(x; w) = \sum_{i=1}^{N} \omega_i K(x, x_i) + \omega_0$$
where $K(x, x_i)$ is the kernel function, taken as either a linear or a Gaussian kernel function, and $w = [\omega_0, \omega_1, \ldots, \omega_N]^{T}$ is the weight vector of the model. Because $\varepsilon_n$ is Gaussian and the target values $t_n$ are independent, the likelihood function of the whole training sample set is
$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{ -\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2} \right\}$$
where $t = [t_1, t_2, \ldots, t_N]^{T}$, $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^{T}$ is an $N \times (N+1)$ matrix, and $\phi(x_n) = [1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]^{T}$;
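Step 3: to give the model generalization ability, a Bayesian framework is used and a prior probability distribution is introduced:
$$p(w \mid \alpha) = \prod_{i=0}^{N} N(\omega_i \mid 0, \alpha_i^{-1})$$
where $\alpha$ is the hyperparameter vector made up of $N+1$ hyperparameters. The posterior probability distribution given the training sample set follows from Bayes' formula:
$$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{p(t)}$$
The posterior probability distribution of the weight vector $w$ is
$$p(w \mid t, \alpha, \sigma^2) = N(w \mid \mu, \Sigma)$$
with posterior covariance and mean
$$\Sigma = (\sigma^{-2}\Phi^{T}\Phi + A)^{-1}, \qquad \mu = \sigma^{-2}\,\Sigma\,\Phi^{T} t$$
where $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$;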
Step 4: integrating out the weights over the training sample set gives $p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw$, which yields the marginal likelihood function of the hyperparameters: $p(t \mid \alpha, \sigma^2) = N(0, C)$, where $C = \sigma^2 I + \Phi A^{-1} \Phi^{T}$. The hyperparameters $\alpha$ and $\sigma^2$ directly affect the posterior distribution of $w$ and must be optimized to obtain its maximum a posteriori distribution. Introducing a delta function turns this into the problem of maximizing the hyperparameter posterior $p(\alpha, \sigma^2 \mid t)$ with respect to $\alpha$ and $\sigma^2$; under uniform hyperpriors, only the marginal likelihood function needs to be maximized;
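Step 5: following MacKay's method, the re-estimation formulas are
$$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2}, \qquad (\sigma^2)^{new} = \frac{\lVert t - \Phi\mu \rVert^2}{N - \sum_i \gamma_i}$$
where $\mu_i$ is the $i$-th element of the mean vector $\mu$, and MacKay's method defines $\gamma_i = 1 - \alpha_i \Sigma_{ii}$, with $\Sigma_{ii}$ the $i$-th diagonal element of the posterior covariance $\Sigma$. The values $\alpha_i^{new}$ and $(\sigma^2)^{new}$ are updated iteratively until all parameters converge, that is, until the gradient of the output is less than $10^{-3}$, or until the maximum number of training iterations, 1000, is reached. This maximum likelihood procedure gives the hyperparameters $\alpha_{MP}$ and the noise variance $\sigma_{MP}^2$;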
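Step 6: for a given input water quality parameter value $x_*$, the probability distribution of the corresponding output is
$$p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = \int p(t_* \mid w, \sigma_{MP}^2)\, p(w \mid t, \alpha_{MP}, \sigma_{MP}^2)\, dw$$
which is Gaussian:
$$p(t_* \mid t, \alpha_{MP}, \sigma_{MP}^2) = N(t_* \mid y_*, \sigma_*^2)$$
where the predictive mean and variance are respectively
$$y_* = \mu^{T}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{T}\,\Sigma\,\phi(x_*)$$
so that $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$;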
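Step 7: setting the confidence level to $1-\theta$, the two-sided confidence interval of $t_*$ is obtained from
$$P\left\{ \left| \frac{t_* - y_*}{\sigma_*} \right| < z_{\theta/2} \right\} = 1-\theta$$
that is,
$$P\{\, y_* - \sigma_* z_{\theta/2} < t_* < y_* + \sigma_* z_{\theta/2} \,\} = 1-\theta$$
so the confidence interval of $t_*$ at confidence level $1-\theta$ is $[\,y_* - \sigma_* z_{\theta/2},\; y_* + \sigma_* z_{\theta/2}\,]$. The upper quantile $z_{\theta/2}$ can be looked up in the standard normal distribution table, and the 95% confidence interval is used.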
Following this derivation, the concrete operating steps of the time series prediction model based on RVM regression are as follows:
(1) determine the training sample set of the water quality parameter (pH value, dissolved oxygen content, permanganate index or ammonia nitrogen content);
(2) select the kernel function, and determine the kernel width $\gamma$ and the noise variance $\sigma^2$;
(3) initialize $\alpha$ and $\sigma^2$;
(4) compute the posterior covariance $\Sigma$ and mean $\mu$ of the weight vector $w$;
(5) update $\alpha_i^{new}$ and $(\sigma^2)^{new}$;
(6) repeat steps (4) and (5) until the maximum number of iterations, 1000, is reached or the gradient of the output is less than $10^{-3}$;
(7) delete the weight coefficients and basis functions whose hyperparameter $\alpha_i$ is greater than or equal to $\alpha_{max}$ (taken as $e^{9}$), which makes the model sparse;
(8) for the test set of the water quality parameter, make predictions using the hyperparameters $\alpha_{MP}$ and noise variance $\sigma_{MP}^2$ obtained from training.
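Operating steps (1) to (8) can be put together in a compact NumPy sketch. It is illustrative only and rests on several assumptions: the Gaussian kernel is taken as $\exp(-\lVert a-b\rVert^2/\text{width}^2)$, the starting values of $\alpha$ and $\sigma^2$ are arbitrary, pruning is done inside the loop rather than once at the end, and convergence is judged by the largest change in $\log\alpha$ instead of the gradient criterion named in step (6). All function names are hypothetical.

```python
import numpy as np


def linear_kernel(a, b):
    return a @ b.T


def gaussian_kernel(a, b, width=1.0):
    # Assumed form: exp(-||a - b||^2 / width^2)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / width ** 2)


def design(x, centers, kernel, cols):
    # phi(x_n) = [1, K(x_n, x_1), ..., K(x_n, x_N)], restricted to kept columns
    full = np.hstack([np.ones((len(x), 1)), kernel(x, centers)])
    return full[:, cols]


def posterior(phi, t, alpha, sigma2):
    cov = np.linalg.inv(phi.T @ phi / sigma2 + np.diag(alpha))
    mu = cov @ phi.T @ t / sigma2
    return mu, cov


def fit_rvm(x, t, kernel, max_iter=1000, tol=1e-3, alpha_max=np.exp(9.0)):
    cols = np.arange(len(x) + 1)
    phi = design(x, x, kernel, cols)
    alpha = np.ones(len(cols))                 # assumed initial values
    sigma2 = 0.1 * np.var(t) + 1e-12
    for _ in range(max_iter):
        mu, cov = posterior(phi, t, alpha, sigma2)             # step (4)
        gamma = np.clip(1.0 - alpha * np.diag(cov), 1e-12, 1.0)
        new_alpha = gamma / (mu ** 2 + 1e-12)                  # step (5)
        resid = t - phi @ mu
        sigma2 = max(float(resid @ resid) / max(len(t) - gamma.sum(), 1e-12), 1e-12)
        change = np.max(np.abs(np.log(new_alpha) - np.log(alpha)))
        keep = new_alpha < alpha_max                           # step (7)
        if not keep.any():
            keep[np.argmin(new_alpha)] = True
        phi, alpha, cols = phi[:, keep], new_alpha[keep], cols[keep]
        if change < tol:
            break
    mu, cov = posterior(phi, t, alpha, sigma2)
    return {"x": x, "cols": cols, "mu": mu, "cov": cov,
            "sigma2": sigma2, "kernel": kernel}


def predict(model, x_new):
    phi = design(x_new, model["x"], model["kernel"], model["cols"])
    mean = phi @ model["mu"]
    var = model["sigma2"] + np.einsum("ij,jk,ik->i", phi, model["cov"], phi)
    return mean, var
```

A Gaussian kernel of a chosen width can be passed as `lambda a, b: gaussian_kernel(a, b, width=2.0)`. The number of surviving entries in `model["cols"]` corresponds to the relevance vector count, plus the bias term if it is retained.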
Two groups of experimental results are analysed in detail below. The data used for the experiments come from the weekly automatic water quality monitoring reports for key sections of the country's main river basins published by the Ministry of Environmental Protection of the People's Republic of China (http://www.mep.gov.cn/), for the Longdong section in Sichuan, from week 1 of 2004 to week 53 of 2012. Week 1 of 2004 to week 52 of 2009 is used as the training data set, and week 1 of 2010 to week 53 of 2012 as the test data set. With the data of the previous 4 weeks used to predict the next week, the training data set contains 308 groups and the test data set 153 groups.
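The group counts can be reproduced with a few lines of Python. The calculation assumes 52 weekly reports in each year from 2004 to 2011 and 53 in 2012, and assumes that each data set is windowed on its own, so the first 4 weeks of each set serve only as inputs.

```python
LAGS = 4  # four previous weeks form one input vector

train_weeks = 6 * 52          # 2004 to 2009, assuming 52 reports per year
test_weeks = 52 + 52 + 53     # 2010 to 2012, with 53 weeks in 2012

train_groups = train_weeks - LAGS
test_groups = test_weeks - LAGS
train_share = train_groups / (train_groups + test_groups)

print(train_groups, test_groups)   # 308 153
print(round(train_share, 3))       # 0.668, close to the 2/3 split
```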
Experiment 1: because the choice of kernel function affects modelling quality to some degree, both a linear kernel function and a Gaussian kernel function are used for RVM regression modelling in the invention, so that the most suitable kernel function for each water quality parameter can be selected from the experimental results.
Figs. 2 to 9 show the time series prediction results for each water quality parameter obtained by the RVM regression modelling algorithm with the linear kernel function and with the Gaussian kernel function, giving a qualitative picture of the results for the different kernel functions.
Figs. 2 to 5 show that the linear-kernel RVM regression time series prediction model predicts pH value well; its predictions for dissolved oxygen, permanganate index and ammonia nitrogen are somewhat weaker but still acceptable. Figs. 6 to 9 show that, although the predicted values of the Gaussian-kernel RVM regression model agree well with the original values for all four water quality parameters, its predictions for dissolved oxygen and permanganate index are clearly better than those for pH value and ammonia nitrogen.
Comparing the predictions of the Gaussian-kernel and linear-kernel RVM regression models for the four water quality parameters shows that the Gaussian-kernel model predicts pH value less well than the linear-kernel model, whereas the linear-kernel model predicts dissolved oxygen, permanganate index and ammonia nitrogen less well than the Gaussian-kernel model.
Because the RVM regression model gives a confidence interval together with each predicted value, the credibility of a prediction can be assessed, which provides water quality monitoring and management bodies with more reference information. In addition to the predicted and original values of the water quality parameters, Figs. 2 to 9 also show the 95% confidence interval, the confidence level most commonly used in statistics. Because actual water quality parameter values are all greater than zero, the part of the original 95% confidence interval that lies below zero is removed. Figs. 2 to 9 show that the RVM regression time series prediction model predicts all four water quality parameters well and that the original values all fall within the confidence interval. Moreover, if an actual monitored value (the original value) lies far from the predicted value and outside the confidence interval, a water pollution accident may have occurred; an early warning can be issued if necessary, prompting the regulator to investigate the cause of the change in water quality.
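The truncated confidence interval and the warning rule described above can be sketched in Python with the standard library. The function names and the choice of 0 as the lower bound parameter are illustrative; the quantile $z_{\theta/2}$ is computed with `statistics.NormalDist` instead of being read from a table.

```python
from statistics import NormalDist


def confidence_interval(mean, std, confidence=0.95, lower_bound=0.0):
    """Two-sided interval [y* - z*sigma*, y* + z*sigma*], cut off at lower_bound."""
    theta = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - theta / 2.0)   # upper quantile z_{theta/2}
    low = max(mean - z * std, lower_bound)        # remove the part below zero
    high = mean + z * std
    return low, high


def needs_warning(observed, mean, std, confidence=0.95):
    """True if the monitored value falls outside the confidence interval."""
    low, high = confidence_interval(mean, std, confidence)
    return not (low <= observed <= high)


# Illustrative numbers only: predicted pH 7.5 with predictive std 0.1
print(confidence_interval(7.5, 0.1))
print(needs_warning(8.0, 7.5, 0.1))
```

Here `std` is the predictive standard deviation $\sigma_*$, the square root of the predictive variance.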
Experiment 2: to examine the matter further, the RVM regression time series modelling algorithm of the invention is compared below with the commonly used SVM regression time series modelling algorithm. The comparison covers four aspects: correlation coefficient R, mean square error MSE, prediction time, and the number of relevance vectors nRV or support vectors nSV (in SVM, the support vector is the counterpart of the relevance vector).
Tables 1 to 4 compare the time series predictions of RVM regression and SVM regression for each water quality parameter.
Table 1. Comparison of time series prediction results for pH value
Table 2. Comparison of time series prediction results for dissolved oxygen
Table 3. Comparison of time series prediction results for permanganate index
Table 4. Comparison of time series prediction results for ammonia nitrogen
The time series forecasting result that RVM in contrast table 1 and SVM return is known, for pH value, if adopt same kernel function, the related coefficient of RVM time series predicting model is obviously greater than SVM time series predicting model, and square error, predicted time and interconnection vector (or support vector) number is all obviously less than SVM time series predicting model.And contrast linear kernel function and gaussian kernel function are known, and the prediction effect of linear kernel function is better than gaussian kernel function.From table 2 and 3, can find out, SVM regression time sequential forecasting models is better than RVM to dissolved oxygen DO and permanganate index on square error MSE, related coefficient is more or less the same, but support vector number is but tens times of RVM regression time sequential forecasting models interconnection vector number even hundreds of times.Generally speaking two kinds of time series predicting models are more or less the same to the prediction effect of dissolved oxygen DO and permanganate index.From in table 4, in the time adopting gaussian kernel function, it is good that RVM regression time sequential forecasting models all returns than SVM at square error, related coefficient, working time and interconnection vector the prediction of ammonia nitrogen, in the time adopting linear kernel function, RVM regression time sequential forecasting models other indexs except square error are all better than SVM.
Predicting the outcome of consolidated statement 1~4 can find, RVM, as SVM, has good generalization ability, and two kinds of time series predicting models all can obtain good predicting the outcome.And the related coefficient of RVM time series predicting model is generally all greater than SVM's, interconnection vector number is far less than SVM support vector number, and predicted time is shorter than SVM.
For more fully comparing the performance of RVM time series predicting model and SVM time series predicting model, draw the relative error curve map (but for obtaining comparison diagram clearly, only drawing the relative error curve map of each forecast model of the concentrated front 50 groups of data of test data) of four water quality parameter time series forecastings as shown in Figure 10~13.
As can be seen from Figure 10, predict for pH value, the relative error minimum of linear kernel function RVM regression time sequential forecasting models, the RVM of gaussian kernel function and SVM regression time sequential forecasting models are larger in the relative error of indivedual points, there is no the effective of linear kernel function SVM regression time sequential forecasting models.Figure 11 is known in observation, the relative error minimum of gaussian kernel function RVM regression time sequential forecasting models to dissolved oxygen prediction, gaussian kernel function SVM regression time sequential forecasting models many places occur that relative error is more a little bigger, the relative error of two kinds of kernel function time series predicting models of RVM is more or less the same, but RVM time series predicting model is more stable and do not have error more a little bigger than SVM time series predicting model on the whole.RVM and SVM regression time sequential forecasting models are more or less the same to the relative error of permanganate index prediction as can be seen from Figure 12, but not good to the prediction effect of pH value and dissolved oxygen DO.Can know that from Figure 13 to find out gaussian kernel function and linear kernel function RVM regression time sequential forecasting models all little than gaussian kernel function and linear kernel function SVM regression time sequential forecasting models respectively to the relative error of ammonia nitrogen prediction, and gaussian kernel function SVM time series predicting model occurs that in many places relative error is more a little bigger, has a strong impact on prediction effect.Comprehensive Figure 10~13 can find out that RVM time series predicting model is better than SVM time series predicting model, and the relative error of RVM is less, and performance is more stable, and can provide the probabilistic information of prediction.
SVM water quality parameter time series prediction models suffer from a large number of support vectors, long prediction times, and the lack of probabilistic output. To address these problems, this document proposes building the water quality parameter time series prediction model by RVM regression, using RVM regression models with a linear kernel and with a Gaussian kernel respectively. The prediction results show that the original values all lie within the 95% confidence interval. Comparison with the SVM regression water quality parameter time series prediction models using the corresponding kernels shows that, on the whole, the prediction accuracy of the RVM models is not lower than that of the SVM models. Because RVM yields a probability distribution over the prediction, it can provide a confidence interval together with the predicted value, giving water quality monitoring and management agencies more reference information. In addition, the RVM regression model is highly sparse, with advantages such as few relevance vectors, short prediction time, and strong generalization ability.
The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art may make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.
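How a Gaussian predictive distribution leads to a 95% confidence interval can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard RVM predictive equations (mean equal to the basis vector times the posterior weight mean, variance equal to the noise variance plus the weight-uncertainty term), and the names `rvm_predict` and `confidence_interval` as well as all numeric values are hypothetical, not taken from the patent.

```python
import numpy as np

def rvm_predict(phi, mu, Sigma, noise_var):
    """Predictive mean and variance for each row of the basis matrix phi.

    Assumed standard RVM form:
      mean = phi @ mu
      var  = noise_var + phi_i^T Sigma phi_i   (one value per row i)
    """
    mean = phi @ mu
    var = noise_var + np.einsum("ij,jk,ik->i", phi, Sigma, phi)
    return mean, var

def confidence_interval(mean, var, z=1.96):
    """Two-sided interval of a Gaussian; z = 1.96 gives about 95% coverage."""
    half_width = z * np.sqrt(var)
    return mean - half_width, mean + half_width

# Hypothetical posterior quantities and basis responses, for illustration only.
phi = np.array([[1.0, 0.5], [1.0, 0.2]])
mu = np.array([7.0, 0.4])
Sigma = np.diag([0.001, 0.002])
noise_var = 0.002

mean, var = rvm_predict(phi, mu, Sigma, noise_var)
lower, upper = confidence_interval(mean, var)

# Check whether hypothetical observed values fall inside the interval.
observed = np.array([7.25, 7.00])
inside = (observed >= lower) & (observed <= upper)
```

Counting the fraction of `True` entries in `inside` over a test set is one way to verify the statement that the original values lie within the 95% confidence interval.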
Claims (5)
1. A water quality parameter time series prediction method based on relevance vector machine regression, characterized by comprising the following steps:
Step 1: collect historical water quality parameter data from an automatic water quality monitoring system and preprocess the data to fill in the missing data in the historical data: first fill the missing data with zeros, then preprocess the historical data in the time domain according to the time series, then apply frequency-domain filtering, and finally use the least squares method to obtain the best fit; in the final fitted curve, the fitted values at the observation points whose values are 0 are taken as the fill-in values for the actual missing data, and these values are substituted to complete the missing data in the historical data;
Step 2: use the first 2/3 of the preprocessed historical water quality parameter data as the training sample set and the last 1/3 as the test sample set;
Step 3: train the RVM using the water quality parameter values of several preceding consecutive unit times in the training sample set as input and the water quality parameter value of the next unit time as output; test the trained RVM with the test sample set by feeding the water quality parameter values of several preceding consecutive unit times in the test sample set to the input of the trained RVM and observing the predicted value at its output; if the error between the predicted value at the output and the water quality parameter value of the next unit time in the test sample set meets the requirement, the test is passed and the water quality parameter time series prediction model based on RVM regression is obtained;
Step 4: use the time series prediction model based on RVM regression to predict new water quality parameters: feed the water quality parameter values of the several most recent unit times to the input of the prediction model, and obtain the predicted water quality parameter value of the next unit time at its output.
2. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 1, characterized in that the water quality parameter is pH value, dissolved oxygen content, permanganate index, or ammonia nitrogen content.
3. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 2, characterized in that the water quality parameter time series prediction model based on RVM regression in step 3 is as follows: for a given input $x_*$ of the water quality parameter pH value, dissolved oxygen content, permanganate index, or ammonia nitrogen content, the predictive mean $y_*$ and variance $\sigma_*^2$ of the corresponding output are, respectively,
$$y_* = \mu^{\mathrm{T}}\phi(x_*), \qquad \sigma_*^2 = \sigma_{MP}^2 + \phi(x_*)^{\mathrm{T}}\,\Sigma\,\phi(x_*).$$
The predicted output $t_*$ follows a Gaussian distribution with mean $y_*$ and variance $\sigma_*^2$,
$$t_* \sim N(y_*, \sigma_*^2),$$
where $\mu^{\mathrm{T}}$ denotes the posterior weight mean and $\sigma_{MP}^2$ is the noise variance; the predicted output is $y(x_*; \mu)$.
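4. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 1, characterized in that, when checking the error in step 3, the mean square error (MSE) and the correlation coefficient R are used as the indices for evaluating the performance of the prediction model, computed respectively as:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad R = \frac{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2 \sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}},$$
where $y_i$ is the observed value, $\hat{y}_i$ is the predicted value, $\bar{y}$ and $\bar{\hat{y}}$ are their respective means, and $n$ is the number of test samples.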
5. The water quality parameter time series prediction method based on relevance vector machine regression according to claim 4, characterized in that the error requirement in step 3 differs for different water quality parameters: the mean square error of the model for pH value prediction is lower than 0.004, the mean square error of the model for dissolved oxygen content prediction is lower than 0.08, the mean square error of the model for permanganate index prediction is lower than 0.02, and the mean square error of the model for ammonia nitrogen content prediction is lower than 0.002; the correlation coefficient of each of the above prediction models is not less than 0.95.
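The data handling in steps 2 and 3 and the acceptance check of claims 4 and 5 can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names (`split_series`, `make_windows`, `passes_check`), the window length of 3, and the sample values are all hypothetical, and the correlation coefficient is assumed to be the Pearson coefficient as computed by `numpy.corrcoef`. The RVM training itself is omitted; only the sample construction and the error check are shown. The threshold values are the ones stated in claim 5.

```python
import numpy as np

def split_series(series):
    """First 2/3 of the series for training, last 1/3 for testing (step 2)."""
    series = np.asarray(series, dtype=float)
    n_train = (2 * len(series)) // 3
    return series[:n_train], series[n_train:]

def make_windows(series, window):
    """Inputs: `window` consecutive values; output: the next value (step 3)."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

def mse(actual, predicted):
    return float(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2))

def correlation(actual, predicted):
    # Pearson correlation coefficient (assumed form of R).
    return float(np.corrcoef(actual, predicted)[0, 1])

# Mean square error limits per parameter and minimum R, as stated in claim 5.
MSE_LIMITS = {"pH": 0.004, "dissolved_oxygen": 0.08,
              "permanganate_index": 0.02, "ammonia_nitrogen": 0.002}
MIN_R = 0.95

def passes_check(parameter, actual, predicted):
    """True if the model meets both the MSE and the R requirement."""
    return bool(mse(actual, predicted) < MSE_LIMITS[parameter]
                and correlation(actual, predicted) >= MIN_R)

# Hypothetical series of 12 readings with a window of 3 preceding values.
series = np.arange(12.0)
train, test = split_series(series)
X_train, y_train = make_windows(train, window=3)
X_test, y_test = make_windows(test, window=3)

# Hypothetical test-set observations and predictions for pH.
ok = passes_check("pH", [7.2, 7.4, 7.1, 7.3], [7.21, 7.38, 7.12, 7.29])
```

Splitting the series before building the windows keeps every training sample strictly earlier in time than every test sample, which matches the ordering described in step 2.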
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410196457.4A CN103942457B (en) | 2014-05-09 | 2014-05-09 | Water quality parameter time series prediction method based on relevance vector machine regression |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103942457A (en) | 2014-07-23 |
CN103942457B (en) | 2017-04-12 |
Family
ID=51190125
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410196457.4A Active CN103942457B (en) | 2014-05-09 | 2014-05-09 | Water quality parameter time series prediction method based on relevance vector machine regression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103942457B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN103942457B (en) | 2017-04-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||