Water supply amount prediction method based on ARIMA-LSTM combined model
Technical Field
The invention belongs to the field of water supply prediction, and particularly relates to a water supply prediction method based on an ARIMA-LSTM combined model.
Background
Water supply prediction underpins the decision-making and planning of water supply utilities and is a prerequisite for operating, managing and optimizing the water supply systems involved. Existing prediction methods and models are of great help in forecasting water demand, but applying them to the water demand of water supply utilities raises a number of problems. For example, the accuracy of the available data, the many variables that affect water demand, the prediction range, and the variety of prediction periods all affect the reliability of the result. The water supply at a given future time is often closely related to the water supply in the past, and a great deal of research has exploited this relationship to make accurate and reliable predictions of municipal water supply.
Many methods have been proposed for water supply prediction; they can be roughly divided into conventional methods and novel methods. Early studies attempted to solve the municipal water supply prediction problem using traditional statistical models, such as linear regression models and time series models. Among time series models for water supply, commonly used ones include the autoregressive integrated moving average (ARIMA) model and the seasonal autoregressive integrated moving average (SARIMA) model. Because such linear models are easy to understand and implement, they have been a focus of research and are widely used in practice.
In practice, however, the variation in water supply combines linear and nonlinear components and is both regular and random, which makes water supply prediction a challenging task. Nonlinear models handle the nonlinear part of the water supply time series better. Learning algorithms (such as machine learning and artificial intelligence) are nonlinear methods; through advanced data analysis, such models can effectively learn valuable information from water supply data and achieve high-precision prediction. The long short-term memory (LSTM) neural network is a popular machine learning method commonly used for water supply prediction, and its effectiveness has been verified in many studies.
However, a single prediction model is limited in prediction accuracy. From the perspective of information utilization, a single model extracts and uses only part of the useful information in the water supply data, so it describes the pattern of the predicted data sequence at only one level of information. In view of these limitations, researchers have proposed the concept of a combined prediction model. A combined model appropriately combines different prediction models and uses the strengths of each to extract different characteristic information from the water supply data, so that the data is used more comprehensively and fully.
In summary, the present invention addresses short-term urban water supply prediction based on an ARIMA-LSTM combined model.
Disclosure of Invention
The invention provides a water supply prediction method based on an ARIMA-LSTM combined model. The method greatly improves prediction accuracy, reduces prediction error, performs well when more data sets are available, and can provide a decision basis and technical support for urban water supply prediction.
A water supply amount prediction method based on an ARIMA-LSTM combined model. The method comprises the following steps:
(1) Preprocessing the original water supply data. The original water supply time series is preprocessed using data preprocessing techniques: the series is made stationary and missing values are handled;
(2) Prediction by the ARIMA model. The linear part of the time series is predicted using an ARIMA model;
(3) Prediction by the LSTM model. The LSTM model is used for error correction of the ARIMA model, that is, it predicts the nonlinear part that the ARIMA model leaves unexplained, which yields the final (corrected) prediction result of the ARIMA model;
(4) ARIMA-LSTM combined model prediction. The final prediction result of the ARIMA model and the prediction result of the LSTM model are combined in parallel as a weighted sum with fixed weights to obtain the prediction result of the ARIMA-LSTM combined model.
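The way steps (3) and (4) fit together can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `correct_arima` and `combine_parallel` are hypothetical, the forecasts are assumed to be available already as plain lists, and every number below is a made-up placeholder rather than a result from the invention.

```python
from typing import List


def correct_arima(arima_pred: List[float], residual_pred: List[float]) -> List[float]:
    """Step (3): add the LSTM-predicted residuals onto the ARIMA forecast."""
    if len(arima_pred) != len(residual_pred):
        raise ValueError("forecast horizons differ")
    return [a + r for a, r in zip(arima_pred, residual_pred)]


def combine_parallel(
    corrected: List[float], lstm_pred: List[float], w_arima: float, w_lstm: float
) -> List[float]:
    """Step (4): fixed-weight parallel combination; the weights must sum to 1."""
    if abs(w_arima + w_lstm - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if len(corrected) != len(lstm_pred):
        raise ValueError("forecast horizons differ")
    return [w_arima * c + w_lstm * p for c, p in zip(corrected, lstm_pred)]


# Placeholder forecasts (hypothetical values, for illustration only).
arima_pred = [100.0, 104.0]      # linear part predicted by ARIMA
residual_pred = [2.0, -4.0]      # ARIMA residuals predicted by an LSTM
lstm_pred = [98.0, 104.0]        # direct LSTM forecast of the series

corrected = correct_arima(arima_pred, residual_pred)
combined = combine_parallel(corrected, lstm_pred, w_arima=0.5, w_lstm=0.5)
```

Equal weights are used here only to keep the arithmetic easy to follow; the method itself derives the weights from the prediction errors.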
In step (1), the time series is first decomposed according to the characteristics of the original series; next, the non-stationary part of the original series is made stationary using ordinary differencing and seasonal differencing; finally, missing data values are filled in by interpolation.
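The differencing and interpolation operations of step (1) can be illustrated with a minimal pure-Python sketch. It makes several assumptions that are not in the original text: missing values are marked as `None`, interpolation is linear, the seasonal period is 4, and interpolation is applied before differencing because a difference cannot be computed across a gap. The function names and sample values are hypothetical.

```python
from typing import List, Optional


def interpolate_missing(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


def difference(series: List[float], lag: int = 1) -> List[float]:
    """Return y[t] - y[t - lag]; lag=1 is ordinary, lag=s is seasonal differencing."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Made-up sample with two missing observations.
raw = [100.0, 102.0, None, 106.0, 109.0, 111.0, None, 115.0]
clean = interpolate_missing(raw)
first_diff = difference(clean, lag=1)      # removes a linear trend
seasonal_diff = difference(clean, lag=4)   # assumed seasonal period of 4
```

Whether ordinary differencing, seasonal differencing, or both are needed depends on the stationarity of the actual series.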
The ARIMA model in step (2) is constructed as follows: 1) test the stationarity of the time series; 2) determine the model order; 3) estimate the model parameters; 4) further optimize and select the model; 5) test the significance of the model (white-noise test of the residuals); 6) predict with the fitted model.
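To make the parameter estimation and prediction steps concrete, the following sketch fits the simplest non-trivial case, ARIMA(1,1,0), by least squares in pure Python. It is a deliberately reduced illustration, not the model-building procedure of the invention: the order is fixed rather than selected, there is no intercept, and the stationarity and white-noise tests are omitted. In practice a statistics library would normally be used. The function names and the sample series are hypothetical.

```python
from typing import List


def fit_ar1(diffed: List[float]) -> float:
    """Least-squares estimate of phi in d[t] = phi * d[t-1] + noise (no intercept)."""
    num = sum(diffed[t] * diffed[t - 1] for t in range(1, len(diffed)))
    den = sum(diffed[t - 1] ** 2 for t in range(1, len(diffed)))
    if den == 0:
        raise ValueError("differenced series is identically zero")
    return num / den


def forecast_arima_110(series: List[float], steps: int) -> List[float]:
    """Forecast with ARIMA(1,1,0): difference once, fit AR(1), integrate back."""
    diffed = [series[t] - series[t - 1] for t in range(1, len(series))]
    phi = fit_ar1(diffed)
    last_level, last_diff = series[-1], diffed[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff          # AR(1) forecast of the next difference
        last_level = last_level + last_diff  # undo the differencing
        forecasts.append(last_level)
    return forecasts


# Made-up series whose increments halve at every step.
history = [10.0, 12.0, 13.0, 13.5, 13.75]
forecast = forecast_arima_110(history, steps=2)
```

The in-sample errors of such a linear model are the quantity the LSTM is later asked to predict in step (3).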
The LSTM model in step (3) is constructed as follows: 1) split off a validation set; 2) preprocess the data; 3) set the parameters; 4) train the model; 5) evaluate the model; 6) predict with the optimal model.
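The first two of these steps are independent of any deep learning framework and can be sketched in pure Python. The sketch assumes an 80/20 chronological split, min-max scaling with bounds taken from the training part only, and a look-back window of 3; none of these values comes from the original text, and the helper names are hypothetical. The resulting windows would then be passed to an LSTM implementation of choice.

```python
from typing import List, Tuple


def min_max_scale(values: List[float], lo: float, hi: float) -> List[float]:
    """Scale to [0, 1] using bounds taken from the training portion only."""
    span = hi - lo
    if span == 0:
        raise ValueError("training data is constant; cannot scale")
    return [(v - lo) / span for v in values]


def make_windows(
    values: List[float], lookback: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (input window, next value) pairs for supervised training."""
    count = len(values) - lookback
    xs = [values[i:i + lookback] for i in range(count)]
    ys = [values[i + lookback] for i in range(count)]
    return xs, ys


# Made-up series; the split is chronological, with no shuffling.
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 30.0]
split = int(len(series) * 0.8)
train, valid = series[:split], series[split:]

lo, hi = min(train), max(train)
train_scaled = min_max_scale(train, lo, hi)
valid_scaled = min_max_scale(valid, lo, hi)   # may fall outside [0, 1]
x_train, y_train = make_windows(train_scaled, lookback=3)
```

Taking the scaling bounds from the training portion alone keeps information from the validation period out of the training inputs.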
In step (4), the weighting coefficients are obtained by minimizing the combined prediction error. The specific solving method is as follows. Assuming that n different single prediction models are used to predict urban water supply, the combined prediction model is expressed as:
$$f_t = \sum_{i=1}^{n} \lambda_i x_{it} \tag{1}$$
In formula (1), $f_t$ is the predicted value of the combined prediction model; $\lambda_i$ is the weighting coefficient of the i-th prediction model; and $x_{it}$ is the predicted value of the i-th model at time t.
The prediction error of the combined prediction model can be expressed as:
$$e_t = y_t - f_t = \sum_{i=1}^{n} \lambda_i e_{it} \tag{2}$$
In formula (2), $e_t$ is the prediction error of the combined prediction model; $y_t$ is the measured water supply; and $e_{it} = y_t - x_{it}$ is the prediction error of the i-th prediction model at time t. The second equality holds because the weighting coefficients sum to 1.
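The squared prediction error of the combined prediction model is:
$$e_t^2 = \left(\sum_{i=1}^{n} \lambda_i e_{it}\right)^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j e_{it} e_{jt} \tag{3}$$
The sum of squared prediction errors of the combined prediction model over N time points, denoted Q, is:
$$Q = \sum_{t=1}^{N} e_t^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j E_{ij} = \Lambda^T E \Lambda, \qquad E_{ij} = \sum_{t=1}^{N} e_{it} e_{jt} \tag{4}$$
where $\Lambda = [\lambda_1, \lambda_2, \dots, \lambda_n]^T$ is the vector of weighting coefficients and $E = (E_{ij})$ is the n×n matrix of error cross-products.
If the unit column vector is $I_n = [1, 1, \dots, 1]^T$, the constraint on the weighting coefficients, $\sum_{i=1}^{n} \lambda_i = 1$, can be written as:
$$I_n^T \Lambda = 1 \tag{5}$$
To minimize the prediction error variance of the combined prediction model, the minimum is sought under this constraint:
$$\min Q = \Lambda^T E \Lambda \quad \text{s.t.} \quad I_n^T \Lambda = 1 \tag{6}$$
Two different models (the corrected ARIMA model and the LSTM model) are combined by weighting, so n = 2, and the weighting coefficients and the sum of squared prediction errors of the optimal combined prediction are as follows:
1. Formula for the weighting coefficients:
$$\lambda_1 = \frac{E_{22} - E_{12}}{E_{11} + E_{22} - 2E_{12}}, \qquad \lambda_2 = \frac{E_{11} - E_{12}}{E_{11} + E_{22} - 2E_{12}} \tag{7}$$
2. Formula for the sum of squared prediction errors:
$$Q = \frac{E_{11} E_{22} - E_{12}^2}{E_{11} + E_{22} - 2E_{12}} \tag{8}$$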
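For the two-model case, minimizing the sum of squared errors subject to the weights summing to 1 has a closed-form solution, which the following sketch computes in pure Python. The function name, the variable names `e11`, `e22` and `e12` (sums of squared errors and of error cross-products), and all numbers are assumptions made for illustration; they are not taken from the original text.

```python
from typing import List, Tuple


def optimal_two_model_weights(
    actual: List[float], pred_a: List[float], pred_b: List[float]
) -> Tuple[float, float, float]:
    """Return (weight_a, weight_b, combined_sse) minimising the squared error of
    weight_a * pred_a + weight_b * pred_b subject to weight_a + weight_b = 1."""
    err_a = [y - p for y, p in zip(actual, pred_a)]
    err_b = [y - p for y, p in zip(actual, pred_b)]
    e11 = sum(e * e for e in err_a)                  # squared errors of model A
    e22 = sum(e * e for e in err_b)                  # squared errors of model B
    e12 = sum(a * b for a, b in zip(err_a, err_b))   # error cross-products
    denom = e11 + e22 - 2.0 * e12
    if denom == 0:
        raise ValueError("error series are identical; weights are not unique")
    w_a = (e22 - e12) / denom
    w_b = (e11 - e12) / denom
    sse = (e11 * e22 - e12 ** 2) / denom
    return w_a, w_b, sse


# Made-up measurements and forecasts.
actual = [10.0, 20.0, 30.0]
corrected_arima = [11.0, 19.0, 31.0]
lstm = [8.0, 22.0, 30.0]
w_arima, w_lstm, sse = optimal_two_model_weights(actual, corrected_arima, lstm)
```

Because only the sum-to-one constraint is imposed, a weight can turn out negative when one model's errors are strongly correlated with, and larger than, the other's. The combined sum of squared errors is never larger than that of either single model on the data used to compute the weights.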
The beneficial effects of the invention are as follows: a combined prediction model is obtained by weighting the ARIMA model and the LSTM model in parallel. The method can effectively improve prediction accuracy and reduce prediction error, and performs well when more data sets are available, thereby providing a decision basis and technical support for urban water supply prediction.
Drawings
FIG. 1 is a flow chart of the ARIMA-LSTM combined model.
FIG. 2 is a diagram of the prediction results of the ARIMA-LSTM combined model.
FIG. 3 is a diagram of the relative prediction error of the ARIMA-LSTM combined model.
Detailed Description
For a clearer understanding of the present invention, it is described in detail below with reference to a practical case.
As shown in FIG. 1, the invention relates to a water supply amount prediction method based on an ARIMA-LSTM combined model, which comprises the following steps:
(1) Preprocessing the original water supply data. The original water supply time series is preprocessed using data preprocessing techniques: the series is made stationary and missing values are handled. First, the original time series is decomposed according to its characteristics; next, the non-stationary part of the original series is made stationary using ordinary differencing and seasonal differencing; finally, missing data values are filled in by interpolation.
(2) Prediction by the ARIMA model. The linear part of the time series is predicted using an ARIMA model. The ARIMA model is constructed as follows: 1) test the stationarity of the time series; 2) determine the model order; 3) estimate the model parameters; 4) further optimize and select the model; 5) test the significance of the model (white-noise test of the residuals); 6) predict with the fitted model.
(3) Prediction by the LSTM model. The LSTM model is used for error correction of the ARIMA model, that is, it predicts the nonlinear part that the ARIMA model leaves unexplained, which yields the final (corrected) prediction result of the ARIMA model. The LSTM model is constructed as follows: 1) split off a validation set; 2) preprocess the data; 3) set the parameters; 4) train the model; 5) evaluate the model; 6) predict with the optimal model.
(4) ARIMA-LSTM combined model prediction. The final prediction result of the ARIMA model and the prediction result of the LSTM model are combined in parallel as a weighted sum with fixed weights to obtain the prediction result of the ARIMA-LSTM combined model. The weighting coefficients are obtained by minimizing the combined prediction error. The specific solving method is as follows. Assuming that n different single prediction models are used to predict urban water supply, the combined prediction model is expressed as:
$$f_t = \sum_{i=1}^{n} \lambda_i x_{it} \tag{1}$$
In formula (1), $f_t$ is the predicted value of the combined prediction model; $\lambda_i$ is the weighting coefficient of the i-th prediction model; and $x_{it}$ is the predicted value of the i-th model at time t.
The prediction error of the combined prediction model can be expressed as:
$$e_t = y_t - f_t = \sum_{i=1}^{n} \lambda_i e_{it} \tag{2}$$
In formula (2), $e_t$ is the prediction error of the combined prediction model; $y_t$ is the measured water supply; and $e_{it} = y_t - x_{it}$ is the prediction error of the i-th prediction model at time t. The second equality holds because the weighting coefficients sum to 1.
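The squared prediction error of the combined prediction model is:
$$e_t^2 = \left(\sum_{i=1}^{n} \lambda_i e_{it}\right)^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j e_{it} e_{jt} \tag{3}$$
The sum of squared prediction errors of the combined prediction model over N time points, denoted Q, is:
$$Q = \sum_{t=1}^{N} e_t^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j E_{ij} = \Lambda^T E \Lambda, \qquad E_{ij} = \sum_{t=1}^{N} e_{it} e_{jt} \tag{4}$$
where $\Lambda = [\lambda_1, \lambda_2, \dots, \lambda_n]^T$ is the vector of weighting coefficients and $E = (E_{ij})$ is the n×n matrix of error cross-products.
If the unit column vector is $I_n = [1, 1, \dots, 1]^T$, the constraint on the weighting coefficients, $\sum_{i=1}^{n} \lambda_i = 1$, can be written as:
$$I_n^T \Lambda = 1 \tag{5}$$
To minimize the prediction error variance of the combined prediction model, the minimum is sought under this constraint:
$$\min Q = \Lambda^T E \Lambda \quad \text{s.t.} \quad I_n^T \Lambda = 1 \tag{6}$$
Two different models (the corrected ARIMA model and the LSTM model) are combined by weighting, so n = 2, and the weighting coefficients and the sum of squared prediction errors of the optimal combined prediction are as follows:
1. Formula for the weighting coefficients:
$$\lambda_1 = \frac{E_{22} - E_{12}}{E_{11} + E_{22} - 2E_{12}}, \qquad \lambda_2 = \frac{E_{11} - E_{12}}{E_{11} + E_{22} - 2E_{12}} \tag{7}$$
2. Formula for the sum of squared prediction errors:
$$Q = \frac{E_{11} E_{22} - E_{12}^2}{E_{11} + E_{22} - 2E_{12}} \tag{8}$$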
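For the general case of n models, the constrained minimum can be worked out with a Lagrange multiplier. The following is a sketch of that standard derivation rather than a formula quoted from the original text. The notation is assumed here: $\Lambda$ is the column vector of weighting coefficients, $E$ is the n×n matrix whose (i, j) entry is the sum over time of the products of the errors of models i and j (symmetric, and assumed invertible), $Q = \Lambda^T E \Lambda$ is the sum of squared errors of the combination, and $\mu$ is the multiplier.

```latex
\begin{aligned}
L(\Lambda, \mu) &= \Lambda^{T} E \Lambda - 2\mu \left( I_n^{T} \Lambda - 1 \right) \\
\frac{\partial L}{\partial \Lambda} &= 2 E \Lambda - 2 \mu I_n = 0
  \;\Rightarrow\; \Lambda = \mu E^{-1} I_n \\
I_n^{T} \Lambda = 1 &\;\Rightarrow\; \mu = \frac{1}{I_n^{T} E^{-1} I_n} \\
\Lambda^{*} &= \frac{E^{-1} I_n}{I_n^{T} E^{-1} I_n}, \qquad
Q^{*} = \Lambda^{*T} E \Lambda^{*} = \frac{1}{I_n^{T} E^{-1} I_n}
\end{aligned}
```

Setting n = 2 and inverting the 2×2 matrix gives weights proportional to $E_{22} - E_{12}$ and $E_{11} - E_{12}$, normalized by $E_{11} + E_{22} - 2E_{12}$, and a minimum of $(E_{11} E_{22} - E_{12}^2) / (E_{11} + E_{22} - 2E_{12})$.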