CN107463993B - Medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network - Google Patents


Info

Publication number: CN107463993B (application CN201710662894.4A)
Authority: CN (China)
Other versions: CN107463993A (Chinese)
Inventor: not disclosed (不公告发明人)
Current assignee: He Zhiyao (application filed by He Zhiyao)
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to its accuracy)
Prior art keywords: network, runoff, forecasting, matrix, mutual information

Classifications

    • G06N3/084: Backpropagation, e.g. using gradient descent (G Physics; G06 Computing; G06N Computing arrangements based on specific computational models; G06N3/00 based on biological models; G06N3/02 Neural networks; G06N3/08 Learning methods)
    • G06N3/08: Learning methods
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem" (G06Q ICT specially adapted for administrative, commercial, financial, managerial or supervisory purposes; G06Q10/00 Administration; Management)


Abstract

The invention discloses a medium- and long-term runoff ensemble forecasting method based on mutual information, kernel principal component analysis (KPCA) and an Elman network, comprising the following steps: collect meteorological-hydrological data and establish a one-to-one correspondence between the index time series and the runoff time series; use the normalized mutual information method to select indexes that are significant and have high normalized mutual information; extract principal components from the screened index data by kernel principal component analysis; construct an Elman neural network model; after z-score standardization of the principal components, split off a training sample for supervised training of the network and a test sample for testing it, and compute each evaluation index; repeat the single forecast many times and take the mean of the repeated forecasts as the final forecast value. The invention can fully exploit the linear and nonlinear relations between meteorological-hydrological data and runoff, establish a mathematical relation model, and realize more accurate and stable medium- and long-term runoff forecasting.

Description

Medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network
Technical Field
The invention belongs to the technical field of information, and particularly relates to a medium-long term runoff forecasting method based on mutual information, kernel principal component analysis and Elman network.
Background
Accurate medium- and long-term runoff forecasting is an important precondition for the comprehensive development and utilization, scientific management, and optimized scheduling of water resources.
At present, the commonly used methods for medium- and long-term runoff forecasting are statistical, i.e., forecasting is realized by finding the statistical relationship between the forecast object and the forecasting factors. The key problems in applying statistical methods to medium- and long-term runoff forecasting involve the following three aspects:
(1) Initial selection of forecasting factors: the method currently in common use is linear correlation analysis (e.g., Pearson or Spearman correlation analysis), i.e., the correlation coefficients between meteorological-hydrological data (including teleconnection factors, local factors, etc.) and the historical runoff series are calculated, and factors with high correlation coefficients are selected as forecasting factors;
(2) Denoising and redundancy removal for the initially selected factors: the method currently in common use is principal component analysis (PCA). Correlation-based screening often selects many highly correlated factors, the different factor time series exhibit strong multicollinearity, and the series also contain a certain amount of noise; the initially selected factors therefore need denoising and redundancy removal. PCA replaces the many original indexes with a few comprehensive indexes that reflect as much of the useful information of the original indexes as possible and are mutually orthogonal;
(3) Establishing the optimal mathematical relationship between the forecast object and the forecasting factors: commonly used models include multiple regression, random forests, artificial neural networks, support vector machines, and the like.
The existing medium-long term runoff forecasting method based on statistics has the following three problems:
(1) the hydrological process is complex, and besides a linear relation, a certain nonlinear relation also exists between the forecasting factors and the forecasting objects. The linear correlation analysis method for the initial selection of the forecasting factors can only describe the linear relation among the variables and cannot reflect the nonlinear relation among the variables;
(2) PCA, used for denoising and redundancy removal of the initially selected factors, is essentially a linear mapping, and the principal components it produces are generated by linear maps. This approach ignores correlations in the data above second order, so the extracted principal components are not optimal;
(3) some of the models used to establish the mathematical relationship between the forecast object and the forecasting factors are in practice linear fits and cannot reflect the nonlinear relationship between them. Compared with other models, the artificial neural network is widely applied in medium- and long-term runoff forecasting thanks to its good robustness and strong nonlinear-mapping and self-learning abilities, but the uncertainty of the network's parameters affects forecast accuracy to some extent, and the results of repeated forecasts differ from one another.
Disclosure of Invention
The invention aims to solve the problems in the traditional statistical forecasting method and provide a medium-long term runoff forecasting method capable of overcoming the problems, so that the stability and the precision of forecasting are improved.
The invention provides a medium and long term runoff forecasting method based on Normalized Mutual Information (NMI), Kernel Principal Component Analysis (KPCA) and an Elman Neural Network (Elman Neural Network), which specifically comprises the following steps:
step 1: data pre-processing
1.1 Collect historical runoff data of the area to be predicted and meteorological-hydrological data that can serve as forecasting factors. Commonly used meteorological-hydrological data include atmospheric circulation characteristics, high-altitude pressure fields, sea surface temperature, and similar indexes; the 74 circulation characteristic indexes or the 130 new atmospheric monitoring indexes provided by the National Climate Center basically cover the commonly used indexes, and preliminary forecasting factors can be selected from them directly.
1.2 Since the influence of meteorological factors on runoff is lagged, establish a one-to-one correspondence between the index time series of the preceding year and the runoff time series of the area to be predicted. For example, suppose the 130 indexes are selected, the forecast target is the annual runoff of 2017, and the historical runoff and index data for 1960-2016 are monthly series. The correspondence is explained for one index; the correspondences between the other indexes and the runoff are analogous. The correspondence is as follows:
TABLE 1 corresponding relationship between certain index time series and annual runoff time series
(Table 1 appears as an image in the original publication.)
Step 2: forecasting factor primary selection based on standard mutual information
2.1 dividing the index time sequence and the annual runoff time sequence into two parts, wherein one part is used as a training sample of the neural network, and the other part is used as a test sample of the trained neural network. For example, the data of the previous 50 years is used as a training sample, and the data of the next 5 years is used as a test sample.
2.2 calculating mutual information. And respectively calculating mutual information of each index time sequence and the corresponding runoff time sequence for the training sample data. Taking the data in table 1 as an example, the mutual information between the 1 st column of the table and the remaining columns of the table is calculated. The formula for the mutual information MI is as follows:
MI(X, Y) = Σ_i Σ_j p(x_i, y_j) · log[ p(x_i, y_j) / (p(x_i) · p(y_j)) ]   (1)

where X = (x_1, x_2, x_3, ..., x_n)^T is the runoff time series, Y = (y_1, y_2, y_3, ..., y_n)^T is the index time series, the numerator p(x_i, y_j) is the joint distribution of X and Y, and p(x_i), p(y_j) are the marginal distributions of X and Y, respectively.
2.3 calculating standard mutual information. The mutual information is normalized, i.e. the MI value of step 2.2 is mapped between 0 and 1 using entropy as denominator. The formula for calculating the standard mutual information NMI is as follows:
NMI(X, Y) = MI(X, Y) / √(H(X) · H(Y))   (2)
wherein, H (X) and H (Y) are the entropy of X and Y respectively, and the calculation formulas of H (X) and H (Y) are as follows:
H(X) = −Σ_i p(x_i) · log p(x_i)   (3)

H(Y) = −Σ_j p(y_j) · log p(y_j)   (4)
2.4 Standard mutual information Significance Test (Significance Test). The bootstrap method is adopted to carry out the inspection of the standard mutual information, and comprises the following steps:
2.4.1 calculating the standard mutual information NMI value of the original runoff time sequence and the index time sequence;
2.4.2 randomly shuffle the two corresponding time series simultaneously K times (generally K = 100), calculate the NMI value after each shuffle, and sort the values from largest to smallest;
2.4.3 take the quantile of the sorted NMI values at a given probability as the NMI threshold corresponding to that significance level;
2.4.4 if the NMI value of the original time series is greater than the NMI threshold at a certain probability level (generally 95%), the two series are considered significantly correlated.
2.5 Select the indexes that pass the significance test and whose normalized mutual information exceeds a certain threshold (generally 0.9; the appropriate value varies with the length of the time series and can be adjusted) as the initially selected forecasting factors.
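The factor screening of step 2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's implementation: the histogram-based probability estimate, the bin count, the choice to shuffle only one series (equivalent to shuffling both simultaneously), and all function names are assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Estimate MI(X; Y) per formula (1) from a 2-D histogram of the series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()                 # joint distribution p(x_i, y_j)
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal p(x_i)
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal p(y_j)
    nz = p_xy > 0                              # skip empty cells to avoid log(0)
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())

def entropy(x, bins=10):
    """Entropy H(X) per formula (3), from the same histogram estimate."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def nmi(x, y, bins=10):
    """Normalized mutual information per formula (2), mapped into [0, 1]."""
    return mutual_information(x, y, bins) / np.sqrt(entropy(x, bins) * entropy(y, bins))

def nmi_significant(x, y, k=100, level=0.95, bins=10, seed=0):
    """Permutation significance test of step 2.4: shuffle K times and compare
    the original NMI with the `level` quantile of the shuffled NMI values."""
    rng = np.random.default_rng(seed)
    shuffled = [nmi(rng.permutation(x), y, bins) for _ in range(k)]
    return bool(nmi(x, y, bins) > np.quantile(shuffled, level))
```

An index series would then be kept as an initially selected factor when `nmi_significant(...)` holds and `nmi(...)` exceeds the chosen threshold (0.9 in the text).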
Step 3: extract principal components by kernel principal component analysis
3.1 Apply z-score standardization to the initially selected forecasting factor data. The formula is:

y* = (y − μ) / σ   (5)

where y* is the z-score-standardized value, y is one item of the initially selected forecasting factor data, μ is the mean of the time series containing y, and σ is the standard deviation of that series.
3.2 Calculate the kernel matrix K of the forecasting factors initially selected in step 2.5. K is an n × n matrix whose element in the i-th row and j-th column is

K_{i,j} = k(y_i, y_j)   (6)

where y_i and y_j are column vectors representing the z-score-standardized time series of different initially selected forecasting factors, and k is a kernel function. Commonly used kernel functions include:

① Linear kernel (Linear Kernel): k(x, y) = x^T y   (7)

② Polynomial kernel (Polynomial Kernel): k(x, y) = (x^T y + c)^p   (8)

③ Radial basis function (Radial Basis Function): k(x, y) = exp(−‖x − y‖² / (2b²))   (9)

④ Sigmoid kernel (Sigmoid Kernel): k(x, y) = tanh(u · x^T y + ξ)   (10)

The quantities b, c, p, u and ξ in formulas (8), (9) and (10) are constants, the parameters of the respective kernel functions.
3.3 Calculate the centered kernel matrix, denoted K_c, an n × n matrix computed as

K_c = K − J·K − K·J + J·K·J   (11)

where J in formula (11) is the n × n matrix in which every element equals 1/n:

J = (1/n) · 1_{n×n}   (12)
3.4 Compute the eigenvalues and eigenvectors of the centered kernel matrix K_c, sort the eigenvalues from largest to smallest, and reorder the eigenvectors accordingly. The sorted eigenvalue matrix Λ and eigenvector matrix U are

Λ = diag(λ_1, λ_2, ..., λ_n), λ_1 ≥ λ_2 ≥ ... ≥ λ_n,  U = (u_1, u_2, ..., u_n)   (13)

3.5 Calculate the normalized eigenvector matrix A, of the form

A = (a_1, a_2, ..., a_n) = (u_1/√λ_1, u_2/√λ_2, ..., u_n/√λ_n)   (14)

i.e., each eigenvector is scaled by the square root of its eigenvalue, a_i = u_i / √λ_i.
3.6 Extract the principal components; the principal component matrix is an n × n square matrix, and the first 2 to 3 principal components are typically taken as forecasting factors. The i-th principal component is calculated as

KPC_i = K_c · a_i

where KPC_i = (kpc_i1, kpc_i2, ..., kpc_in) and K_c is the centered kernel matrix obtained in step 3.3.
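The kernel-matrix, centering, eigendecomposition, and projection steps (3.2-3.6) can be sketched as follows. This is an illustrative reading, not the patent's code: it uses the radial basis kernel of formula (9), treats each row of `Y` as one sample, and the function names and bandwidth value are assumptions.

```python
import numpy as np

def rbf_kernel_matrix(Y, sigma=1.0):
    """K[i, j] = exp(-||y_i - y_j||^2 / (2 sigma^2)) for rows y_i of Y."""
    sq = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def kpca_components(Y, n_components=2, sigma=1.0):
    """KPCA sketch following steps 3.2-3.6: kernel matrix, centering with
    J = (1/n) * ones, sorted eigendecomposition, and projection K_c @ a_i."""
    n = Y.shape[0]
    K = rbf_kernel_matrix(Y, sigma)
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J           # centering, equation (11)
    eigval, eigvec = np.linalg.eigh(Kc)          # returned in ascending order
    order = np.argsort(eigval)[::-1]             # re-sort large -> small
    eigval, eigvec = eigval[order], eigvec[:, order]
    # normalized eigenvectors a_i = u_i / sqrt(lambda_i), as in step 3.5
    A = eigvec[:, :n_components] / np.sqrt(eigval[:n_components])
    return Kc @ A                                # one principal component per column
```

The returned columns are mutually orthogonal and zero-mean, which is the denoising/redundancy-removal property the text relies on.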
Step 4: construction of the Elman neural network model
4.1 To construct the Elman network model, first determine the network structure (i.e., the number of nodes in each layer). The structure of the Elman network is shown in figure 2 of the specification. The nodes of each layer are determined as follows: the number of input layer (Input Layer) nodes equals the number of forecasting factors; the number of output layer (Output Layer) nodes equals the number of forecast objects; the number of context layer (Context Layer) nodes equals the number of hidden layer (Hidden Layer) nodes; the number of hidden layer nodes has an important influence on the generalization performance of the network, but there is currently no systematic, standard method for determining it. A good choice is trial and error, i.e., trying different numbers of hidden layer nodes and observing the network's forecasting performance.
4.2 Constructing the Elman network model also requires choosing the network's training algorithm. The invention updates the network weights by back-propagation with a gradient-descent rule that includes a momentum term and an adaptive learning rate. The weight update formula is:

Δω_k = −η · ∂E/∂ω + α · Δω_{k−1}   (15)

where E is the cost function (Cost Function), for which the invention uses the mean squared error (MSE); ω is the weight matrix of the Elman neural network; Δω_k is the weight change at the k-th update; η is the learning rate (Learning Rate); and α is the momentum constant (Momentum Constant), 0 ≤ α < 1, with α = 0.9 in the invention. The learning rate is updated at each iteration as follows:
η(k)=η(k-1)(1+c·cosθ) (16)
where c is a constant, taken as 0.2 in the invention, and θ is the angle between the steepest-descent direction −∂E/∂ω and the previous weight change Δω_{k−1}.
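A single-hidden-layer Elman network with the momentum weight update described in step 4.2 can be sketched as follows. This is a minimal illustration under stated assumptions: the tanh hidden activation, linear output, gradient truncated at the context copy, and omission of the adaptive learning-rate rule are all simplifications, and every name is hypothetical.

```python
import numpy as np

class ElmanNet:
    """Minimal Elman network: input -> hidden (tanh) -> linear output, with a
    context layer holding the previous hidden state; trained by
    backpropagation with a momentum term."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(0, 0.5, (n_hidden, n_in))      # input -> hidden
        self.Wc = rng.normal(0, 0.5, (n_hidden, n_hidden))  # context -> hidden
        self.Wo = rng.normal(0, 0.5, (n_out, n_hidden))     # hidden -> output
        self.context = np.zeros(n_hidden)
        self.vel = [np.zeros_like(w) for w in (self.Wx, self.Wc, self.Wo)]

    def forward(self, x):
        self.x = x
        self.c = self.context.copy()
        self.h = np.tanh(self.Wx @ x + self.Wc @ self.c)
        self.context = self.h.copy()          # context layer remembers h for next step
        return self.Wo @ self.h

    def train_step(self, x, target, eta=0.01, alpha=0.9):
        """One squared-error gradient step, truncated at the context copy."""
        y = self.forward(x)
        err = y - target
        d_h = (self.Wo.T @ err) * (1 - self.h ** 2)   # tanh derivative
        grads = [np.outer(d_h, self.x), np.outer(d_h, self.c), np.outer(err, self.h)]
        for w, v, g in zip((self.Wx, self.Wc, self.Wo), self.vel, grads):
            v[:] = -eta * g + alpha * v               # momentum update rule
            w += v
        return float((err ** 2).mean())
```

In practice the context state would be reset at the start of each training pass over the series.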
Step 5: single-model forecasting of runoff
5.1 Normalize the principal component factor sequences extracted in step 3.6 together with the historical runoff sequence of the area to be predicted, and then divide them into training and test samples as in step 2.1. The normalization formula is:

z = (z_max − z_min) · (q − q_min) / (q_max − q_min) + z_min   (17)

where z is the normalized value, z_max = 1, z_min = −1, z ∈ [−1, 1]; q is one item of the original runoff sequence or a principal component sequence; q_min is the minimum and q_max the maximum of the sequence containing q.
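The min-max normalization of step 5.1 and its inverse (needed in step 5.3 to recover runoff in physical units) can be sketched as follows; all names are assumptions for illustration.

```python
import numpy as np

def minmax_scale(q, z_min=-1.0, z_max=1.0):
    """Map a series into [z_min, z_max]; also return (q_min, q_max)
    so the transform can be reversed later."""
    q = np.asarray(q, dtype=float)
    q_min, q_max = q.min(), q.max()
    z = (z_max - z_min) * (q - q_min) / (q_max - q_min) + z_min
    return z, (q_min, q_max)

def minmax_inverse(z, bounds, z_min=-1.0, z_max=1.0):
    """Reverse normalization: recover the original-unit values."""
    q_min, q_max = bounds
    return (np.asarray(z, dtype=float) - z_min) / (z_max - z_min) * (q_max - q_min) + q_min
```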
5.2 Use the factor data in the training sample as the network input and the historical runoff data in the training sample as the network output, and train the network by supervised learning.
5.3 For the trained network, use the factors in the test sample as the network input and test its forecasting performance. De-normalize the test output to obtain the predicted runoff values.
5.4 Use the mean absolute percentage error (MAPE), relative error (RE), maximum relative error (MRE), and qualification rate (QR) as forecast evaluation indexes. The formulas are:

MAPE = (1/J) · Σ_{i=1}^{J} |x̂_i − x_i| / x_i × 100%   (18)

RE_i = (x̂_i − x_i) / x_i × 100%   (19)

MRE = max_{1≤i≤J} |RE_i|   (20)

In formulas (18), (19) and (20), x̂_i is the runoff value forecast for the test period, x_i is the corresponding actual runoff value, and J is the number of test samples.
QR = T_qual / T_total × 100%   (21)

In formula (21), T_qual is the number of qualified forecasts and T_total is the total number of forecasts. Following the scheme for evaluating medium- and long-term runoff forecast accuracy in the Standard for Hydrological Information and Hydrological Forecasting (GB/T 22482-2008), the method counts a forecast as qualified when its maximum relative error is smaller than 20%.
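The four evaluation indexes of formulas (18)-(21) can be computed together; a sketch, using the 20% qualification threshold from the cited standard, with all names assumed:

```python
import numpy as np

def forecast_metrics(pred, actual, qual_threshold=0.20):
    """MAPE, per-sample relative errors, maximum relative error, and
    qualification rate (share of forecasts with |RE| under the threshold)."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    re = (pred - actual) / actual                  # relative error per sample
    mape = float(np.mean(np.abs(re))) * 100        # formula (18), in percent
    mre = float(np.max(np.abs(re))) * 100          # formula (20), in percent
    qr = float(np.mean(np.abs(re) < qual_threshold)) * 100  # formula (21)
    return {"MAPE": mape, "RE": re * 100, "MRE": mre, "QR": qr}
```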
Step 6: ensemble prediction of runoff
In step 4, gradient-descent search of the Elman network weight space is realized with the back-propagation algorithm, iteratively reducing the error between the actual historical runoff values and the network's predictions. However, the error surface may contain several different local minima, and the gradient-descent search may come to rest at a local minimum rather than the global minimum. Therefore, even though every trained Elman network has the same structure, the connection weights of the models differ, which causes the forecasts of the individual Elman models to differ from one another. To reduce the deviation of the forecast caused by this parameter uncertainty, the invention performs the single-model runoff forecast many times and takes the mean of the repeated forecasts as the final result.
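The ensemble step reduces to averaging repeated single-model forecasts. A sketch, assuming a user-supplied `train_and_predict(seed)` routine (a hypothetical name) that builds, trains, and evaluates one Elman model and returns the forecast series for the test period:

```python
import numpy as np

def ensemble_forecast(train_and_predict, n_runs=20):
    """Repeat a single-model forecast with different random seeds and
    average the results, damping the run-to-run parameter uncertainty."""
    runs = np.stack([np.asarray(train_and_predict(seed)) for seed in range(n_runs)])
    return runs.mean(axis=0)
```

Because each run starts from different random initial weights, the mean of many runs is more stable than any single run.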
Compared with the prior art, the invention has the advantages that:
(1) the factor initial-selection method based on normalized mutual information reflects both the linear and the nonlinear relations among the variables, so the selected factors are more representative; this overcomes the shortcomings of traditional factor screening based on linear correlation analysis;
(2) Kernel Principal Component Analysis (KPCA) is a nonlinear extension of Principal Component Analysis (PCA): the original vectors are mapped into a high-dimensional feature space F by a mapping function φ, and PCA is performed in F. Data that are not linearly separable in the original space become almost linearly separable in the high-dimensional feature space, and the principal components extracted by PCA there are more representative. KPCA-based feature extraction therefore greatly improves the handling of nonlinear data and has clear advantages over traditional PCA-based feature extraction. In addition, the principal components extracted by KPCA are mutually orthogonal, and the data are denoised and their redundancy removed, which helps prevent overfitting of the neural network and improves its generalization ability;
(3) the artificial neural network has good robustness and strong nonlinear-mapping and self-learning abilities, and can mine the intrinsic relation between forecasting factors and forecast objects well. The Elman neural network chosen by the invention is a typical dynamic recurrent network; compared with ordinary feed-forward networks (such as the BP network), it adds a context layer. The context layer records information from the previous network iteration and feeds it into the current iteration, which makes the Elman network better suited to forecasting time-series data. In addition, the neural network suffers from parameter uncertainty; to reduce the uncertainty of the forecast, a multi-model ensemble forecasting method is adopted, which improves the forecast accuracy;
(4) the NMI for factor initial selection, the KPCA for principal component extraction and the Elman neural network for runoff forecasting all have the capacity of processing nonlinear data except linear data, and the three methods are combined together, so that the limitation of the traditional method can be overcome, and the stability and accuracy of forecasting can be improved.
Drawings
FIG. 1 is a general flow diagram of the present invention;
fig. 2 is a block diagram of the Elman network.
Detailed Description
The invention is further elucidated with reference to the drawings and the embodiments.
FIG. 1 is the overall flow chart of the invention. Taking the forecast of the annual average runoff into the reservoir of the Jinping-I hydropower station as an example, the method proceeds in six steps following the flow chart:
step 1: data pre-processing
1.1 Collect historical runoff data of the area to be predicted and meteorological-hydrological data that can serve as forecasting factors; commonly used meteorological-hydrological data include atmospheric circulation characteristics, high-altitude pressure fields, sea surface temperature, and similar indexes. The data used in this embodiment comprise the annual average runoff of the Jinping-I hydropower station reservoir section for 1960-2011 and the 74 circulation characteristic quantities for 1959-2010.
1.2 Because the annual average runoff is being forecast, factors cannot be taken from the same year; moreover, the influence of meteorological factors on runoff is lagged. Therefore, following Table 1, a one-to-one correspondence is established between the annual average runoff of the Jinping-I hydropower station (1960-2011) and the 74 atmospheric circulation indexes of the preceding year (1959-2010). The correspondence between one atmospheric circulation index time series and the runoff time series is shown in Table 2; the other indexes are analogous.
TABLE 2 corresponding relationship between a certain atmospheric circulation index time series and a runoff time series
(Table 2 appears as an image in the original publication.)
Step 2: forecasting factor primary selection based on standard mutual information
2.1 dividing the index time sequence and the annual runoff time sequence into two parts, wherein one part is used as a training sample of the neural network, and the other part is used as a test sample of the trained neural network. This example uses the data of the first 47 years as training samples and the data of the last 5 years as test samples.
2.2 Calculate the mutual information MI. For the training sample data, calculate the mutual information between each index time series and the corresponding runoff time series. For this example, the mutual information between the annual average runoff series in column 1 of Table 2 and the index time series in the remaining columns is calculated according to formula (1). Note that, so that the test period can verify the forecasting effect reliably, only the training sample data are used to calculate the mutual information and screen the preliminary forecasting factors; the test sample data must not be included.
2.3 calculate the normalized mutual information NMI, i.e. map the MI values calculated in step 2.2 between 0 and 1 with (2), (3) and (4).
2.4 Significance Test of mutual information (Significance Test). In this embodiment, the bootstrap method is adopted to perform mutual information verification, and includes the following steps:
2.4.1 calculating the standard mutual information NMI value of the original runoff time sequence and the index time sequence;
2.4.2 randomly shuffle the two time series simultaneously 100 times, calculate the NMI value after each shuffle, and sort the values from largest to smallest;
2.4.3 taking the probability quantile of the NMI in the sequential arrangement as the NMI threshold value corresponding to the significance level of the probability;
2.4.4 if the NMI value of the original time series is greater than the NMI threshold at a certain probability level (95% in this embodiment), the two series are considered significantly correlated.
2.5 Select the indexes that pass the significance test and whose normalized mutual information exceeds the threshold (0.9 in this embodiment) as the initially selected forecasting factors. In this embodiment, 205 indexes have normalized mutual information greater than 0.9; the first 20 are as follows:
TABLE 3 First 20 initially selected forecasting factors

Initially selected factor | NMI | MI
August sunspot number | 0.988375 | 5.426929
April sunspot number | 0.988375 | 5.426929
July sunspot number | 0.988375 | 5.426929
October sunspot number | 0.988375 | 5.426929
December sunspot number | 0.988375 | 5.426929
February sunspot number | 0.98444 | 5.384376
September sunspot number | 0.98444 | 5.384376
November sunspot number | 0.98444 | 5.384376
January sunspot number | 0.98444 | 5.384376
March sunspot number | 0.98444 | 5.384376
May sunspot number | 0.98444 | 5.384376
August Northern Hemisphere subtropical high intensity index (5°E-360°) | 0.980474 | 5.341823
March Northern Hemisphere polar vortex area index (zone 5, 0°-360°) | 0.980474 | 5.341823
June North Africa-North America subtropical high intensity index (110°W-60°E) | 0.976477 | 5.299270
June Northern Hemisphere subtropical high intensity index (5°E-360°) | 0.976291 | 5.256717
April Northern Hemisphere subtropical high intensity index (5°E-360°) | 0.972448 | 5.256717
July North Africa-North America subtropical high intensity index (110°W-60°E) | 0.972448 | 5.256717
September North Africa-North America subtropical high intensity index (110°W-60°E) | 0.972448 | 5.256717
June sunspot number | 0.972448 | 5.256717
June Pacific subtropical high intensity index (110°E-115°W) | 0.970919 | 5.240655
Step 3: perform kernel principal component analysis and select principal components as forecasting factors. The 205 factor sequences selected in step 2.5 often exhibit multicollinearity among themselves. Multicollinear forecasting factors enlarge the weight matrix of the neural network, and the repeated information and noise directly affect the training speed and generalization ability of the network, so feature extraction, denoising, and redundancy removal are required. In this embodiment, the radial basis kernel is chosen as the kernel function for the kernel principal component analysis; the principal components are calculated according to formulas (5), (6), (9), (11), (12), (13) and (14) and arranged in descending order of variance contribution rate. The variance contribution rates of the first 5 extracted principal components are shown in Table 4, and the data of the corresponding first 5 principal components are shown in Table 5.
TABLE 4 Variance contribution rates of the first 5 principal components

Principal component | _1 | _2 | _3 | _4 | _5
Variance contribution rate | 25.7% | 6.9% | 5.6% | 5.1% | 3.9%
TABLE 5 first 5 principal Components extracted by KPCA
(Table 5 appears as an image in the original publication.)
In this embodiment, trial and error is used to determine which principal components to select as forecasting factors. Repeated tests show that the forecast during the test period is best when the first two principal components are used, so the first two principal components are finally chosen as forecasting factors. Note that, to keep the criteria for extracting principal components consistent between the training and test samples, the training sample sequence and the test sample sequence must be combined before applying KPCA. In this embodiment the training sample sequence has length 47 and the test sample sequence has length 5, so the combined sequence has length 52; accordingly, the principal component sequences extracted in Table 5 have length 52.
Step 4: construction of the Elman neural network model
4.1 To construct the Elman network model, first determine the network structure, i.e., the number of nodes in each layer. The nodes of each layer are determined as follows:
(1) The number of input layer (Input Layer) nodes equals the number of forecasting factors. In this embodiment the first two principal components are selected as forecasting factors, so the Elman network has 2 input layer nodes;
(2) The number of output layer nodes equals the number of forecast objects; in this embodiment a single value, the annual mean runoff, is forecast, so there is 1 output layer node;
(3) The number of context layer nodes equals the number of hidden layer nodes;
(4) The number of hidden layer nodes strongly affects the generalization performance of the network, but there is currently no systematic, standard method for choosing it. A practical choice is trial and error: try different numbers of hidden nodes and observe the forecasting performance. In this embodiment, because KPCA has already denoised and de-redundified the factor data and the resulting principal components are mutually orthogonal, overfitting of the neural network is effectively prevented: with 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, and 15 hidden nodes, the relative forecast error over the test period stays within 20%, and the network is stable with good generalization ability. Repeated trials show that with 10 hidden nodes the maximum relative error over the test period drops to 15%, so the number of hidden layer nodes is set to 10.
4.2 Constructing the Elman network model also requires choosing the training algorithm. In this embodiment the network weights are updated by back-propagation gradient descent with a momentum term and an adaptive learning rate; the weight update formulas are given in formulas (15) and (16).
Step 5: single-model forecasting of runoff
5.1 Following step 2.1, normalize the principal component factor sequences extracted in step 3.5 and the historical runoff sequence of the region to be forecast according to formula (1), then split them into training samples and test samples. In this embodiment, the first 47 years of the two principal component sequences selected in step 3 and of the Jinping-I hydropower station runoff data are used as training samples, and the last 5 years as test samples.
5.2 Use the factor data in the training samples as network input and the historical runoff data in the training samples as network output, and train the network by supervised learning. The learning process is summarized as follows:
(1) Initialize the connection weight coefficients between the layers of the network with a random function, and set the allowable error of the cost function; this embodiment uses the mean squared error (MSE) as the cost function;
(2) Input a learning sample to the network, compute the value E of the mean squared error function, and update the connection weights between the layers according to E;
(3) If E is still larger than the allowable error, return to step (2); otherwise learning is finished and the network output is computed.
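A minimal sketch of this supervised loop, with a plain linear model standing in for the Elman network; the data, learning rate and tolerance below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(47, 2))                 # factor data (training inputs)
true_w = np.array([0.8, -0.5])
y = X @ true_w + 0.01 * rng.normal(size=47)  # historical runoff (training targets)

w = rng.normal(size=2)                       # (1) random initial weights
eps = 1e-3                                   # allowable cost-function error
eta = 0.05                                   # learning rate
for _ in range(10000):
    pred = X @ w
    E = np.mean((pred - y) ** 2)             # (2) mean squared error E
    if E <= eps:                             # (3) stop once E is small enough
        break
    grad = 2 * X.T @ (pred - y) / len(y)
    w -= eta * grad                          # weight update according to E
print(round(E, 5))
```

The Elman network differs in having hidden and context layers, but the stop-on-tolerance loop has the same shape.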
5.3 For the trained network, use the factor data in the test samples as network input to test its forecasting performance, and apply inverse normalization to the test result to obtain the forecast runoff values.
5.4 The mean absolute percentage error (MAPE), maximum relative error (MRE) and qualification rate (QR) are used as forecast evaluation indexes, each computed according to formulas (17), (18), (19) and (20). To verify the generalization ability and forecast stability of the network model, this embodiment ran the single-model forecast 100 times; in every run the maximum relative error over the test period stayed within 16% and the qualification rate reached 100%, showing that the network model used in the invention has good generalization ability and forecast stability. Error statistics for the first 5 test-period forecasts are given in Table 6.
TABLE 6 Single model test period prediction error statistics
(Table 6 appears only as an image in the original document.)
Step 6: ensemble prediction of runoff
In order to reduce the deviation of the forecast result caused by uncertainty in the model parameters, the method runs the single-model runoff forecast multiple times and takes the mean of the results as the final forecast. In this embodiment, the mean of the 100 forecast runs is used as the final forecast result.
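Averaging the runs is the whole ensemble step; a sketch with hypothetical forecast values standing in for the 100 trained-model outputs:

```python
import numpy as np

# each row: one trained model's forecasts over the 5-year test period (illustrative numbers)
single_runs = np.array([
    [1210.0, 1185.0, 1302.0, 1250.0, 1198.0],
    [1195.0, 1201.0, 1288.0, 1262.0, 1205.0],
    [1224.0, 1190.0, 1310.0, 1244.0, 1192.0],
])
ensemble_forecast = single_runs.mean(axis=0)  # mean across the single-model runs
print(ensemble_forecast)
```

With independently initialized networks, the mean damps the run-to-run parameter variability the text describes.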
The above description is only an example of the present invention and is not intended to limit the present invention. All equivalents which come within the spirit of the invention are therefore intended to be embraced therein. Details not described herein are well within the skill of those in the art.

Claims (2)

1. A method for forecasting medium and long term runoff based on mutual information-kernel principal component analysis-Elman network comprises mutual information, kernel principal component analysis and Elman neural network, and is characterized by comprising the following steps:
s1, constructing a forecasting factor matrix, wherein the forecasting factors adopt the 74 circulation characteristic indexes or the 130 new atmospheric monitoring indexes, the latter comprising 88 monthly atmospheric circulation indexes, 26 monthly sea temperature indexes and 16 other monthly indexes; the forecasting factors are organized on a monthly scale, and a 74 x 12 or 130 x 12 forecasting factor matrix is established by considering lag times t of 1 to 12 months;
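The lag construction can be sketched as follows; the helper name and the toy monthly series are assumptions, and each of the 74 (or 130) indexes would contribute one such block of 12 lagged rows:

```python
import numpy as np

def lagged_factor_matrix(index_values, n_lags=12):
    """Arrange one monthly index into n_lags lagged candidate-factor series.

    index_values: 1-D monthly series; row t-1 of the result is the series
    shifted back by t months (lag t = 1..n_lags), truncated to equal length.
    """
    n = len(index_values) - n_lags
    return np.stack([index_values[n_lags - t : n_lags - t + n]
                     for t in range(1, n_lags + 1)])

monthly_index = np.arange(24, dtype=float)   # toy 2-year monthly series
F = lagged_factor_matrix(monthly_index)
print(F.shape)   # one row per lag; stacking all indexes row-wise gives the full matrix
```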
s2, calculating the mutual information of the forecasting factor matrix and the runoff time sequence by adopting a mutual information method, which comprises the following steps:
s201, calculating the mutual information of the forecasting factor matrix and the runoff time sequence, and specifically adopting the following formula to calculate:
$$I(X;Y)=\sum_{i=1}^{n}\sum_{j=1}^{n}p(x_i,y_j)\log\frac{p(x_i,y_j)}{p(x_i)\,p(y_j)}$$

wherein X = (x_1, x_2, x_3, ..., x_n)^T is the runoff time series and Y = (y_1, y_2, y_3, ..., y_n)^T is the index time series; the numerator p(x_i, y_j) is the joint distribution law of X and Y, and p(x_i), p(y_j) are the marginal distribution laws of X and Y respectively;
s202, after mutual information is calculated, the mutual information value is mapped between 0 and 1, and the following formula is specifically adopted for calculation:
$$NMI(X;Y)=\frac{I(X;Y)}{\sqrt{H(X)\,H(Y)}}$$

$$H(X)=-\sum_{i=1}^{n}p(x_i)\log p(x_i)$$

wherein H(X) and H(Y) are the entropies of X and Y respectively; H(Y) is calculated analogously to H(X);
s203, testing the calculated normalized mutual information with a bootstrap method: randomly shuffle the two corresponding time series simultaneously K times, compute the NMI value after each shuffle, and arrange these NMI values in descending order; take the probability quantiles of the sorted NMI values as the NMI thresholds for the corresponding significance levels. If the NMI value of the original time series exceeds the NMI value at a chosen probability threshold, the two series are considered significantly related; indexes that pass the significance test and whose normalized mutual information exceeds a chosen threshold are selected as preliminary forecasting factors;
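Steps S201-S203 can be sketched with a histogram (plug-in) estimate of the mutual information; the bin count, number of shuffles K, significance level and toy series below are assumptions:

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Plug-in estimate of I(X;Y) (nats) from a joint histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)          # marginal of X
    py = pxy.sum(axis=0, keepdims=True)          # marginal of Y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def nmi(x, y, bins=8):
    """Normalize I(X;Y) by sqrt(H(X)H(Y)) so the score lies in [0, 1]."""
    def H(v):
        p, _ = np.histogram(v, bins=bins)
        p = p / p.sum()
        p = p[p > 0]
        return float(-np.sum(p * np.log(p)))
    return mutual_info(x, y, bins) / np.sqrt(H(x) * H(y))

def nmi_threshold(x, y, K=200, alpha=0.05, bins=8, seed=0):
    """Shuffle one series relative to the other K times (this breaks the
    pairing, as in S203) and take the (1 - alpha) quantile of the shuffled
    NMI values as the significance threshold."""
    rng = np.random.default_rng(seed)
    null = sorted(nmi(x, rng.permutation(y), bins) for _ in range(K))
    return null[int((1 - alpha) * K) - 1]

rng = np.random.default_rng(42)
runoff = rng.normal(size=300)
index_related = runoff + 0.3 * rng.normal(size=300)   # a genuinely dependent factor
score = nmi(runoff, index_related)
thr = nmi_threshold(runoff, index_related)
print(score > thr)   # a significantly related factor passes the test
```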
s3, performing dimensionality reduction on the primarily selected forecasting factors by using kernel principal component analysis, and extracting principal components, wherein the method specifically comprises the following steps:
s301, standardizing the primarily selected forecasting factor data z-score, and specifically adopting the following formula to calculate:
$$y^{*}=\frac{y-\mu}{\sigma}$$

in the formula, y* is the z-score normalized data, y is one element of the preliminarily selected forecasting factor data, μ is the mean of the time series containing y, and σ is the standard deviation of that time series;
s302, calculating the kernel matrix K of the preliminarily selected forecasting factors, where K is an n x n matrix whose element K_{i,j} in row i, column j is:
$$K_{i,j}=k(\tilde{y}_i,\tilde{y}_j)$$

wherein $\tilde{y}_i$ and $\tilde{y}_j$ are column vectors representing the z-score normalized time series of the i-th and j-th preliminarily selected predictors, and k is a kernel function. Optional kernel functions include the linear kernel

$$k(x,y)=x^{T}y+c,$$

the polynomial kernel

$$k(x,y)=\left(b\,x^{T}y+c\right)^{p},$$

the radial basis kernel

$$k(x,y)=\exp\left(-\upsilon\,\|x-y\|^{2}\right),$$

and the sigmoid kernel

$$k(x,y)=\tanh\left(\upsilon\,x^{T}y+\xi\right);$$
b, c, p, υ and ξ in the kernel function formulas are constants, the parameters of the respective kernel functions;
s303, calculating the centered kernel matrix K_c, an n x n matrix, as follows:
Kc=K-J·K-K·J+J·K·J
in the formula,

$$J=\frac{1}{n}\mathbf{1}_{n\times n},$$

that is, J is the n x n matrix with every element equal to 1/n;
s304, calculating the eigenvalues and eigenvectors of the centered kernel matrix K_c, arranging the eigenvalues in descending order and reordering the eigenvectors accordingly; the sorted eigenvalue matrix is

$$\Lambda=\mathrm{diag}(\lambda_1,\lambda_2,\dots,\lambda_n),\qquad \lambda_1\ge\lambda_2\ge\dots\ge\lambda_n,$$

and the corresponding eigenvector matrix is

$$V=(v_1,v_2,\dots,v_n);$$
S305, calculating the normalized eigenvector matrix

$$\tilde{V}=(\tilde{v}_1,\tilde{v}_2,\dots,\tilde{v}_n),$$

wherein

$$\tilde{v}_i=\frac{v_i}{\sqrt{\lambda_i}};$$
S306, extracting the principal components, wherein the principal component matrix is an n x n square matrix and the first 2 to 3 principal components are extracted as forecasting factors; the i-th principal component is calculated as:
$$KPC_i=K_c\,\tilde{v}_i$$

in the formula, KPC_i = (kpc_{i1}, kpc_{i2}, ..., kpc_{in}), K_c is the centered kernel matrix calculated in step S303, and i = 1, 2, ..., n;
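Steps S303-S306 can be sketched end-to-end in NumPy; the linear kernel, the reading of S305 as a 1/√λ eigenvector scaling, and the toy data are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
Y = rng.normal(size=(8, 5))                       # 8 samples of 5 normalized factors
K = Y @ Y.T                                       # a linear kernel, for simplicity

# S303: center the kernel matrix, Kc = K - J.K - K.J + J.K.J with J = (1/n) * ones
n = K.shape[0]
J = np.full((n, n), 1.0 / n)
Kc = K - J @ K - K @ J + J @ K @ J

# S304: eigendecompose and sort by descending eigenvalue
eigvals, eigvecs = np.linalg.eigh(Kc)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# S305: scale each eigenvector by 1/sqrt(lambda) (the usual KPCA normalization)
keep = eigvals > 1e-10                            # drop numerically null directions
V = eigvecs[:, keep] / np.sqrt(eigvals[keep])

# S306: principal component scores, KPC_i = Kc @ v_i, one row per component
KPC = (Kc @ V).T
predictors = KPC[:2]                              # keep the first 2 as forecasting factors
print(predictors.shape)
```

The centering formula is algebraically the familiar H K H with H = I - J, and the resulting score rows are mutually orthogonal, which is what lets the later network stage avoid collinear inputs.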
s4, constructing an Elman network model, determining the structure parameters of the network and a training algorithm, wherein the structure parameters comprise the number of input layer nodes and the number of hidden layer nodes, and the method specifically comprises the following steps:
s401, arranging the obtained principal components in descending order of variance contribution rate; using a trial-and-error method, combining different numbers of principal components and hidden layer nodes, and determining the optimal numbers of principal components and hidden layer nodes from the forecast error over the test period;
s402, updating the network weights with a back-propagation gradient descent algorithm with a momentum term and an adaptive learning rate, as follows:
$$\Delta\omega_k=-\eta\frac{\partial E}{\partial\omega}+\alpha\,\Delta\omega_{k-1}$$

$$\eta(k)=\eta(k-1)\left(1+c\cdot\cos\theta\right)$$

in the formulas, E is the cost function (the mean squared error function), ω is the weight matrix of the Elman neural network, Δω_k is the weight change at the k-th update, η is the learning rate, α is the momentum constant with 0 ≤ α < 1 (taken as 0.9), c is a constant taken as 0.2, and θ is the angle between the steepest-descent direction $-\partial E/\partial\omega$ and the previous weight change Δω_{k-1};
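A sketch of one weight update under formulas (15) and (16); the vector shapes and the numbers are illustrative assumptions:

```python
import numpy as np

def update_weights(w, grad, dw_prev, eta, alpha=0.9, c=0.2):
    """One update with momentum and the cos-theta adaptive learning rate:
    theta is the angle between the steepest-descent direction (-grad)
    and the previous weight change dw_prev."""
    steepest = -grad
    denom = np.linalg.norm(steepest) * np.linalg.norm(dw_prev)
    cos_theta = (steepest @ dw_prev) / denom if denom > 0 else 0.0
    eta_new = eta * (1 + c * cos_theta)        # speed up when directions agree
    dw = -eta_new * grad + alpha * dw_prev     # momentum term: alpha * dw_prev
    return w + dw, dw, eta_new

w = np.array([0.5, -0.2])
grad = np.array([0.1, 0.3])
dw_prev = np.array([-0.05, -0.15])             # previous step, parallel to -grad
w2, dw, eta2 = update_weights(w, grad, dw_prev, eta=0.1)
print(eta2 > 0.1)   # consistent directions -> learning rate increased
```

When successive steps point the same way (cos θ near 1) the rate grows by up to the factor 1 + c; when they oppose each other it shrinks, which damps oscillation.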
s5, for the model with determined structure parameters and training algorithm, training the model for many times by using the selected principal components, and performing ensemble prediction, specifically comprising the following steps:
s501, normalizing the selected main component factor sequence and the historical runoff sequence of the area to be predicted, and then dividing the sequences into a training sample and a test sample, wherein the normalization formula is as follows:
$$z=z_{\min}+\frac{\left(z_{\max}-z_{\min}\right)\left(q-q_{\min}\right)}{q_{\max}-q_{\min}}$$

wherein z is the normalized data, z_max = 1, z_min = -1, z ∈ [-1, 1]; q is an element of the original runoff sequence or of a principal component sequence, q_min is the minimum of the sequence containing q, and q_max is its maximum;
s502, taking factor data in a training sample as input of a network, taking historical runoff data in the training sample as output of the network, and performing supervised learning training on the network;
s503, for the trained network, using the factors in the test samples as network input and applying inverse normalization to the network output to obtain the forecast runoff values; then evaluating the forecasts, counting a forecast as qualified when its maximum forecast error is below 20%, following the accuracy-evaluation scheme for medium- and long-term runoff forecasting in the hydrological information forecasting specification (GB/T 22482-2008);
S504, repeating steps S502 and S503 to train multiple models, and averaging their predicted values to obtain the ensemble forecast.
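The S501 normalization and the inverse normalization applied in S503 to the network output can be sketched as follows, assuming the standard min-max mapping onto [-1, 1]; the runoff numbers are illustrative:

```python
import numpy as np

def normalize(q, q_min, q_max, z_min=-1.0, z_max=1.0):
    """Map q from [q_min, q_max] onto [z_min, z_max]."""
    return z_min + (z_max - z_min) * (q - q_min) / (q_max - q_min)

def denormalize(z, q_min, q_max, z_min=-1.0, z_max=1.0):
    """Inverse mapping, used on the network output to recover runoff values."""
    return q_min + (z - z_min) * (q_max - q_min) / (z_max - z_min)

runoff = np.array([980.0, 1210.0, 1455.0, 1102.0])
z = normalize(runoff, runoff.min(), runoff.max())
print(z.min(), z.max())                      # endpoints map to -1 and 1
back = denormalize(z, runoff.min(), runoff.max())
assert np.allclose(back, runoff)             # the round trip recovers the data
```

In practice q_min and q_max come from the training data, so the same constants must be reused when denormalizing test-period outputs.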
2. The method for forecasting medium- and long-term runoff based on the mutual information-kernel principal component analysis-Elman network as claimed in claim 1, wherein in S503 the predicted values are evaluated using the mean absolute percentage error MAPE, the relative error RE, the maximum relative error MRE and the qualification rate QR, each calculated as follows:
$$MAPE=\frac{1}{j}\sum_{i=1}^{j}\left|\frac{\hat{x}_i-x_i}{x_i}\right|\times 100\%$$

$$RE_i=\frac{\hat{x}_i-x_i}{x_i}\times 100\%$$

$$MRE=\max_{1\le i\le j}\left|RE_i\right|$$

$$QR=\frac{T_{qual}}{T_{total}}\times 100\%$$

in the formulas, $\hat{x}_i$ is the predicted runoff value over the test period, x_i is the corresponding actual runoff value, j is the number of test samples, T_qual is the number of qualified forecasts, and T_total is the total number of forecasts.
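The four indexes can be sketched as follows; the observed and predicted values are illustrative:

```python
import numpy as np

def evaluate(pred, obs, qualify=0.20):
    """MAPE, maximum relative error and qualification rate, with a forecast
    counted qualified when its |relative error| is below 20%, in the
    GB/T 22482-2008 style used by the claims."""
    re = (pred - obs) / obs                        # relative error per sample
    mape = float(np.mean(np.abs(re))) * 100.0
    mre = float(np.max(np.abs(re))) * 100.0
    qr = float(np.mean(np.abs(re) < qualify)) * 100.0
    return mape, mre, qr

obs = np.array([1000.0, 1200.0, 1100.0, 900.0, 1300.0])
pred = np.array([1100.0, 1150.0, 1050.0, 950.0, 1250.0])
mape, mre, qr = evaluate(pred, obs)
print(round(mape, 2), round(mre, 2), qr)
```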
CN201710662894.4A 2017-08-04 2017-08-04 Medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network Active CN107463993B (en)
