CN107463993A - Medium- and long-term runoff forecasting method based on mutual information, kernel principal component analysis and Elman networks - Google Patents
- Publication number
- CN107463993A (CN201710662894.4A)
- Authority
- CN
- China
- Prior art keywords
- mutual information
- forecast
- component analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Abstract
The invention discloses a medium- and long-term runoff ensemble forecasting method based on mutual information, kernel principal component analysis and Elman networks, comprising the following steps: collecting hydrometeorological data and establishing a one-to-one correspondence between the index time series and the runoff time series; selecting the indices that are significant and have high normalized mutual information by the normalized mutual information method; extracting the principal components of the screened index data by kernel principal component analysis; building an Elman neural network model; after z-score standardization of the principal components, dividing them into training samples, used for supervised training of the network, and test samples, used to test the network, and calculating each evaluation index; and repeating the single forecast multiple times and taking the ensemble mean of the repeated forecasts as the final forecast value. The invention can fully exploit the linear and nonlinear relationships between hydrometeorological data and runoff, establish a numerical model of those relationships, and forecast medium- and long-term runoff more accurately and stably.
Description
Technical field
The invention belongs to the field of information technology, and in particular relates to a medium- and long-term runoff forecasting method based on mutual information, kernel principal component analysis and Elman networks.
Background art
Accurate medium- and long-term runoff forecasting is an important prerequisite for guiding comprehensive water resources planning, scientific management and optimal operation.
At present, medium- and long-term runoff forecasting is commonly based on statistical methods, which produce a forecast by finding the statistical relationship between the predictand and the predictors. Applying statistical methods to medium- and long-term runoff forecasting involves the following three key issues:
(1) Preliminary selection of predictors: the method currently in common use is linear correlation analysis (such as Pearson or Spearman correlation analysis). The correlation coefficients between the hydrometeorological data (including teleconnection factors, local factors, etc.) and the historical runoff series are calculated, and the factors with high correlation coefficients are selected as predictors;
(2) Noise reduction and redundancy removal for the selected factors: the method currently in common use is principal component analysis (PCA). When factors are screened by correlation analysis, the highly correlated factors that pass the screening are often numerous, the time series of different factors exhibit strong multicollinearity, and the factor time series themselves contain some noise. The selected factors therefore need to be denoised and have their redundancy removed. PCA replaces the original, more numerous indices with a few composite indices that retain as much of the useful information in the original indices as possible and are mutually orthogonal;
(3) Establishing the optimal mathematical relationship between the predictand and the predictors: models currently in common use include multiple regression, random forests, artificial neural networks and support vector machines.
Existing statistics-based medium- and long-term runoff forecasting methods have the following three problems:
(1) Hydrological processes are complex, and besides linear relationships there are necessarily nonlinear relationships between the predictors and the predictand. The linear correlation analysis used for preliminary predictor selection can describe only linear relationships between variables and cannot reflect nonlinear ones;
(2) PCA, used to denoise the preliminarily selected factors and remove their redundancy, is essentially a linear mapping method, and the principal components it yields are generated by linear mapping. It ignores correlations of order higher than two in the data, so the extracted principal components are not optimal;
(3) Among the models used to establish the optimal mathematical relationship between the predictand and the predictors, the commonly used multiple regression is in practice also a linear fit and cannot reflect nonlinear relationships between the predictand and the predictors. Compared with other models, artificial neural networks are widely used in medium- and long-term runoff forecasting because of their good robustness and strong nonlinear mapping and self-learning abilities. However, the uncertainty of the network parameters affects forecast accuracy, and the results of individual forecasts can differ from one another by a certain margin.
Summary of the invention
The purpose of the invention is to address the problems of the traditional statistical methods and to provide a medium- and long-term runoff forecasting method that overcomes them, thereby improving the stability and accuracy of the forecast.
The medium- and long-term runoff forecasting method provided by the invention, based on normalized mutual information (NMI), kernel principal component analysis (KPCA) and the Elman neural network, specifically comprises the following steps:
Step 1: Data preprocessing
1.1 Collect the historical runoff data of the region to be forecast and the hydrometeorological data that can serve as predictors. Commonly used hydrometeorological data include indices such as atmospheric circulation characteristics, upper-air pressure fields and sea surface temperature. The 74 circulation characteristic indices or the new set of 130 atmospheric monitoring indices provided by the National Climate Center essentially cover these commonly used indices, so the preliminary predictors can be selected directly from them.
1.2 Because the influence of meteorological factors on runoff is lagged, establish a one-to-one correspondence between the index time series of the preceding year and the runoff time series of the region to be forecast. For example, suppose 130 indices are selected, the predictand is the annual runoff of 2017, and month-by-month historical runoff data and historical index data are available for 1960-2016. The correspondence is illustrated with one index; the other indices correspond to runoff in the same way. The correspondence is as follows:
Table 1. Correspondence between the time series of one index and the annual runoff time series
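The lagged pairing of step 1.2 can be sketched as follows. This is a minimal illustration, assuming one index value per year (in the method each monthly index of the preceding year would be handled as its own series); the function name `pair_lagged` and the toy numbers are hypothetical and are not taken from the patent.

```python
import numpy as np

def pair_lagged(index_by_year, runoff_by_year, lag=1):
    """Pair the runoff of year t with the index value of year t - lag."""
    years = sorted(y for y in runoff_by_year if (y - lag) in index_by_year)
    runoff = np.array([runoff_by_year[y] for y in years], dtype=float)
    factor = np.array([index_by_year[y - lag] for y in years], dtype=float)
    return years, factor, runoff

# Toy values for illustration only (not data from the patent).
index_by_year = {1959: 0.3, 1960: 0.5, 1961: 0.1}
runoff_by_year = {1960: 1200.0, 1961: 1350.0, 1962: 1100.0}

years, factor, runoff = pair_lagged(index_by_year, runoff_by_year)
for y, f, r in zip(years, factor, runoff):
    print(f"runoff {y}: {r}  <-  index {y - 1}: {f}")
```

Keeping the pairing in one helper makes it harder to leak same-year index values into the predictors by accident.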
Step 2: Preliminary predictor selection based on normalized mutual information
2.1 Divide the index time series and the annual runoff time series into two parts, one serving as training samples for the neural network and the other as test samples for the trained network. For example, the data of the first 50 years serve as training samples and the data of the last 5 years as test samples.
2.2 Calculate the mutual information. Using the training sample data, calculate the mutual information between each index time series and the corresponding runoff time series. Taking the data in Table 1 as an example, this means calculating the mutual information between column 1 of the table and each of the remaining columns. The mutual information MI is calculated by formula (1), where X is the runoff time series, X=(x1,x2,x3...xn)T; Y is the index time series, Y=(y1,y2,y3...yn)T; p(xi,yj) in the numerator is the joint probability distribution of X and Y; and p(xi) and p(yj) are the marginal probability distributions of X and Y, respectively.
2.3 Calculate the normalized mutual information, i.e. use the entropies as the denominator to map the MI value of step 2.2 to between 0 and 1. The normalized mutual information NMI is calculated by formula (2), where H(X) and H(Y), the entropies of X and Y, are given by formulas (3) and (4).
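The formula images for (1) to (4) did not survive in this text. A plausible reconstruction from the symbol definitions in steps 2.2 and 2.3 is given here. The MI and entropy expressions are the standard definitions; the particular normalization in (2), twice the MI divided by the sum of the entropies, is an assumption. It is, however, consistent with the first row of Table 3 if base-2 logarithms are used and all 47 training runoff values are distinct: 2 × 5.426929 / (log2 47 + 5.426929) ≈ 0.988375.

```latex
\begin{align}
MI(X,Y)  &= \sum_{i}\sum_{j} p(x_i,y_j)\,\log\frac{p(x_i,y_j)}{p(x_i)\,p(y_j)} \tag{1}\\
NMI(X,Y) &= \frac{2\,MI(X,Y)}{H(X)+H(Y)} \tag{2}\\
H(X)     &= -\sum_{i} p(x_i)\,\log p(x_i) \tag{3}\\
H(Y)     &= -\sum_{j} p(y_j)\,\log p(y_j) \tag{4}
\end{align}
```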
2.4 Significance test of the normalized mutual information. The normalized mutual information is tested by the bootstrap method, in the following steps:
2.4.1 Calculate the NMI value of the original runoff time series and index time series;
2.4.2 Randomly shuffle the order of the two corresponding time series K times (typically 100 times), calculate the NMI value after each shuffle, and arrange the values in descending order;
2.4.3 Take the probability quantile of the ordered NMI values as the NMI threshold for the corresponding significance level;
2.4.4 If the NMI value of the original time series exceeds the NMI value corresponding to a given probability threshold (typically 95%), the two sets of data are considered significantly correlated.
2.5 Select as preliminary predictors the indices that pass the significance test and whose normalized mutual information exceeds a given threshold (typically 0.9; the value may vary with the length of the time series and can be adjusted as needed).
Step 3: Perform kernel principal component analysis and extract the principal components
3.1 Standardize the preliminarily selected predictor data by z-score:
y* = (y - μ)/σ (5)
where y* is the z-score standardized value, y is one value in the preliminarily selected predictor data, μ is the mean of the time series containing y, and σ is the standard deviation of that time series.
3.2 Calculate the kernel matrix K of the predictors preliminarily selected in step 2.5. K is an n × n matrix whose element Ki,j in row i and column j is calculated by formula (6), in which the arguments of the kernel function k are column vectors representing the z-score standardized time series of the different preliminarily selected predictors. Commonly used kernel functions, given by formulas (7) to (10), are the following:
1. Linear kernel;
2. Polynomial kernel;
3. Radial basis function (RBF) kernel;
4. Sigmoid kernel.
The quantities b, c, p, δ, υ and ξ in formulas (8), (9) and (10) are constants, the parameters of the respective kernel functions.
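The kernel formulas (6) to (10) are missing from this text, so the sketch below uses the textbook forms of the four kernels. How the constants b, c, p, δ, υ and ξ are assigned to the kernels, and the 2δ² denominator in the RBF kernel, are assumptions made for illustration. The rows of `Y` are taken to be the vectors fed to the kernel, which is also an assumption about the data layout.

```python
import numpy as np

def linear_kernel(x, y):
    return float(np.dot(x, y))

def polynomial_kernel(x, y, b=1.0, c=1.0, p=2):
    # Assumed form: (b * <x, y> + c) ** p
    return float((b * np.dot(x, y) + c) ** p)

def rbf_kernel(x, y, delta=1.0):
    # Assumed form: exp(-||x - y||^2 / (2 * delta^2))
    d2 = float(np.sum((x - y) ** 2))
    return float(np.exp(-d2 / (2.0 * delta ** 2)))

def sigmoid_kernel(x, y, upsilon=1.0, xi=0.0):
    # Assumed form: tanh(upsilon * <x, y> + xi)
    return float(np.tanh(upsilon * np.dot(x, y) + xi))

def kernel_matrix(Y, kernel, **params):
    """K[i, j] = kernel(Y[i], Y[j]) for the n row vectors of Y."""
    n = Y.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(Y[i], Y[j], **params)
    return K

# Toy vectors for illustration only.
Y = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
K = kernel_matrix(Y, rbf_kernel, delta=1.0)
```

With the RBF kernel the diagonal of `K` is always 1, since every vector is at distance zero from itself.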
3.3 Calculate the centered kernel matrix. The centered kernel matrix is denoted Kc; it is an n × n matrix calculated as follows:
Kc=K-JK-KJ+JKJ (11)
In formula (11), J is the n × n matrix in which every element equals 1/n.
3.4 Calculate the eigenvalues and eigenvectors of the centered kernel matrix Kc, sort the eigenvalues in descending order, and reorder the eigenvectors accordingly. After sorting, the eigenvalue matrix is denoted Λ and the eigenvector matrix U.
3.5 Calculate the normalized eigenvector matrix A.
3.6 Extract the principal components. The principal component matrix is an n × n square matrix. Generally the first 2 to 3 principal components are extracted as predictors. In the formula for the i-th principal component, KPCi=(kpci1,kpci2,...,kpcin) and KC is the centered kernel matrix calculated in step 3.3.
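Steps 3.3 to 3.6 can be sketched as follows. The centering line follows formula (11) directly. The normalization of the eigenvectors (each divided by the square root of its eigenvalue) and the projection `Kc @ A` are the usual KPCA conventions and are assumptions here, because the corresponding formulas are missing from this text. The function name `kpca_components` is hypothetical.

```python
import numpy as np

def kpca_components(K, n_components=2):
    """Return the leading kernel principal components and their
    variance contribution rates for an n x n kernel matrix K."""
    n = K.shape[0]
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J            # formula (11)

    eigvals, eigvecs = np.linalg.eigh(Kc)         # ascending order
    order = np.argsort(eigvals)[::-1]             # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    keep = eigvals[:n_components]                 # assumed to be > 0
    A = eigvecs[:, :n_components] / np.sqrt(keep) # assumed normalization
    scores = Kc @ A                               # column i = i-th component
    ratio = keep / eigvals[eigvals > 1e-12].sum() # variance contribution
    return scores, ratio

# Toy data for illustration only: a linear kernel on random vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
scores, ratio = kpca_components(X @ X.T, n_components=2)
```

Each column of `scores` has one entry per sample, so with the 52-year series of the embodiment each component would be a series of length 52.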
Step 4: Build the Elman neural network model
4.1 To build the Elman network model, first determine the network structure, i.e. the number of nodes in each layer. The structure of the Elman network is shown in Fig. 2. The number of nodes in each layer is determined as follows: the number of input layer nodes equals the number of predictors; the number of output layer nodes equals the number of predictands; the number of context layer nodes equals the number of hidden layer nodes. The number of hidden layer nodes strongly affects the generalization performance of the network, but there is as yet no systematic, standard method for determining it. A reasonable choice is trial and error: use different numbers of hidden layer nodes, observe the forecast performance of the network, and choose the number of hidden layer nodes accordingly.
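The layer sizing rules of step 4.1 can be illustrated with a bare forward pass. This is a sketch only: the tanh hidden activation, the linear output layer and the random initialization are assumptions not stated in the text, training is not included, and the class name `ElmanNetwork` is hypothetical.

```python
import numpy as np

class ElmanNetwork:
    """Forward pass of an Elman network; the context layer has the
    same number of nodes as the hidden layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W_ctx = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        self.b_h = np.zeros(n_hidden)
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.b_o = np.zeros(n_out)
        self.context = np.zeros(n_hidden)

    def reset(self):
        self.context = np.zeros_like(self.context)

    def step(self, x):
        # The hidden layer sees the current input and the previous hidden state.
        h = np.tanh(self.W_in @ x + self.W_ctx @ self.context + self.b_h)
        self.context = h.copy()          # context layer stores the hidden state
        return self.W_out @ h + self.b_o

    def predict(self, X):
        self.reset()
        return np.array([self.step(x) for x in X])

# Two predictors, ten hidden (and context) nodes, one predictand.
net = ElmanNetwork(n_in=2, n_hidden=10, n_out=1)
out = net.predict(np.zeros((5, 2)))
```

Trial and error over the hidden layer size then amounts to building this network with different `n_hidden` values and comparing test-period errors.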
4.2 Building the Elman network model also requires a training algorithm. The invention uses the back-propagation algorithm together with gradient descent with a momentum term and an adaptive learning rate to update the weights of the network. The weight update is given by formula (15), in which E is the cost function, for which the invention uses the mean squared error (MSE); ω is the weight matrix of the Elman neural network; Δωk is the change in the weights at the k-th update; η is the learning rate; and α is the momentum constant, 0≤α<1, taken as α=0.9 in the invention. The learning rate is updated at each iteration as follows:
η(k)=η(k-1)(1+c·cosθ) (16)
where c is a constant, taken as 0.2 in the invention, and θ is the angle between the steepest descent direction and the previous weight change Δωk-1.
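One update of the training rule can be sketched as below. The learning-rate line implements formula (16) as printed. Formula (15) is missing from this text, so the weight-change line uses the classic momentum form Δω(k) = α·Δω(k-1) - η·∂E/∂ω, which is an assumption; the function name `momentum_step` is hypothetical.

```python
import numpy as np

def momentum_step(w, grad, dw_prev, eta_prev, alpha=0.9, c=0.2):
    """One weight update with momentum and an adaptive learning rate."""
    descent = -grad                                   # steepest descent direction
    denom = np.linalg.norm(descent) * np.linalg.norm(dw_prev)
    cos_theta = float(descent.ravel() @ dw_prev.ravel() / denom) if denom > 0 else 0.0
    eta = eta_prev * (1.0 + c * cos_theta)            # formula (16)
    dw = alpha * dw_prev + eta * descent              # assumed form of formula (15)
    return w + dw, dw, eta

# Toy numbers: the previous step points along the descent direction,
# so cos(theta) = 1 and the learning rate grows by the factor 1 + c.
w = np.array([1.0, 0.0])
grad = np.array([1.0, 0.0])
dw_prev = np.array([-0.5, 0.0])
w_new, dw, eta = momentum_step(w, grad, dw_prev, eta_prev=0.1)
```

When successive steps agree in direction the rate grows, and when they oppose each other it shrinks, which damps oscillation across narrow valleys of the error surface.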
Step 5: Single-model forecast of runoff
5.1 Normalize the principal component factor series extracted in step 3.6 and the historical runoff series of the region to be forecast, then divide them into training samples and test samples as described in step 2.1. The normalization formula is as follows:
z = (zmax - zmin)(q - qmin)/(qmax - qmin) + zmin (17)
where z is the normalized value, zmax=1, zmin=-1, z∈[-1,1]; q is one value in the original runoff series or principal component series; qmin is the minimum of the series containing q; and qmax is the maximum of the series containing q.
5.2 Take the factor data in the training samples as the network input and the historical runoff data in the training samples as the network output, and train the network by supervised learning.
5.3 For the trained network, take the factors in the test samples as the network input and test the forecast performance of the network. Denormalize the test results to obtain the forecast runoff values.
5.4 Take the mean absolute percentage error (MAPE), relative error (RE), maximum relative error (MRE) and qualified rate (QR) as the evaluation indices of the forecast. In formulas (18), (19) and (20), the forecast quantity is the runoff value forecast for the test period, xi is the corresponding observed runoff value, and j is the number of test samples. In formula (21), TQual is the number of qualified forecasts and Ttotal is the total number of forecasts. Following the scheme for evaluating medium- and long-term runoff forecast accuracy in the Standard for Hydrological Information and Hydrological Forecasting (GB/T 22482-2008), the invention treats a forecast whose maximum relative error is below 20% as a qualified forecast.
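The evaluation indices of step 5.4 can be sketched as follows. Formulas (18) to (21) are missing from this text, so the usual definitions are assumed: RE is the signed relative error of each test sample, MAPE and MRE are the mean and maximum of its absolute value in percent, and QR is the share of forecasts whose MRE is below the 20% limit. The function names are hypothetical.

```python
import numpy as np

def forecast_errors(pred, actual):
    """Relative errors per sample, plus MAPE and MRE in percent."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    re = (pred - actual) / actual
    mape = float(np.mean(np.abs(re)) * 100.0)
    mre = float(np.max(np.abs(re)) * 100.0)
    return re, mape, mre

def qualified_rate(mre_per_forecast, limit=20.0):
    """Percentage of forecasts whose MRE (in percent) is below the limit."""
    m = np.asarray(mre_per_forecast, dtype=float)
    return float(np.mean(m < limit) * 100.0)

# Toy values for illustration only.
re, mape, mre = forecast_errors([110.0, 90.0, 100.0], [100.0, 100.0, 100.0])
qr = qualified_rate([10.0, 25.0, 15.0, 19.0])
```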
Step 6: Ensemble forecast of runoff
In step 4, the invention uses the back-propagation algorithm to perform a gradient descent search over the weight space of the Elman network, iteratively reducing the error between the observed historical runoff data and the network's forecasts. However, the error surface may contain several distinct local minima, and the gradient descent search may stop at a local minimum that is not necessarily the global minimum. Consequently, even if the trained Elman networks share the same structure, their connection weights differ, which causes differences among the forecasts of the individual Elman network models. To reduce the forecast deviation caused by this parameter uncertainty, the invention performs the single-model runoff forecast multiple times and takes the mean of the forecasts as the final forecast.
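The ensemble averaging of step 6 can be sketched as below. Training a real Elman network is replaced here by a stand-in, `noisy_forecast`, which returns the observed values perturbed by random noise to mimic run-to-run differences; that stand-in, the function names and the toy numbers are all assumptions for illustration.

```python
import numpy as np

def ensemble_forecast(single_forecast, n_runs=100, seed=0):
    """Run the single-model forecast n_runs times and average the results."""
    rng = np.random.default_rng(seed)
    runs = np.array([single_forecast(rng) for _ in range(n_runs)])
    return runs.mean(axis=0), runs

# Toy observed runoff for a 5-year test period (illustrative values only).
observed = np.array([1200.0, 1350.0, 1100.0, 1280.0, 1190.0])

def noisy_forecast(rng):
    # Stand-in for "train one Elman network and forecast the test period".
    return observed * (1.0 + rng.normal(0.0, 0.05, size=observed.shape))

mean_forecast, runs = ensemble_forecast(noisy_forecast)
```

For every test year the absolute error of the ensemble mean cannot exceed the average absolute error of the individual runs, which is the sense in which averaging stabilizes the forecast.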
Compared with the prior art, the invention has the following advantages:
(1) The preliminary predictor selection method based on normalized mutual information reflects not only linear but also nonlinear relationships between variables, so the selected factors are more representative. This overcomes the shortcoming of traditional factor screening based on linear correlation analysis;
(2) Kernel principal component analysis (KPCA) is a nonlinear extension of principal component analysis (PCA): a mapping function Φ maps the original vectors into a high-dimensional feature space F, and PCA is performed on F. Data that are linearly inseparable in the original space are almost always linearly separable in the high-dimensional feature space, so performing PCA there yields more representative principal components. KPCA-based feature extraction therefore handles nonlinear data much better than traditional PCA-based feature extraction. In addition, the principal components extracted by KPCA are mutually orthogonal and the data have been denoised and stripped of redundancy, which helps prevent overfitting of the neural network and improves its generalization ability;
(3) Artificial neural networks are robust, have strong nonlinear mapping and self-learning abilities, and can uncover the intrinsic relationship between the predictors and the predictand well. The Elman neural network selected by the invention is a typical dynamic recurrent network which, compared with a conventional feedforward neural network (such as a BP neural network), has an additional context layer. The context layer records the information of the previous network iteration and feeds it in as input to the current iteration, which makes the Elman network better suited to forecasting time series data. In addition, because neural networks suffer from parameter uncertainty, a multi-model ensemble forecasting method is adopted to reduce forecast uncertainty and improve forecast accuracy;
(4) The NMI used for preliminary factor selection, the KPCA used for principal component extraction and the Elman neural network used for runoff forecasting can all handle both linear and nonlinear data. Combining the three methods overcomes the limitations of conventional methods and improves the stability and accuracy of the forecast.
Brief description of the drawings
Fig. 1 is the overview flow chart of the present invention;
Fig. 2 is the structure chart of Elman networks.
Detailed description
The invention is further described below with reference to the accompanying drawings and an embodiment.
Fig. 1 is the overall flow chart of the invention. Taking the forecast of the annual mean runoff of the Jinping Hydropower Station reservoir as an example, the method follows the flow chart and can be divided into six steps, as follows:
Step 1: Data preprocessing
1.1 Collect the historical runoff data of the region to be forecast and the hydrometeorological data that can serve as predictors. Commonly used hydrometeorological data include indices such as atmospheric circulation characteristics, upper-air pressure fields and sea surface temperature. The data used in this embodiment comprise the year-by-year annual mean runoff data of the Jinping Hydropower Station reservoir area for 1960-2011 and the month-by-month data of the 74 circulation characteristic indices for 1959-2010.
1.2 Because the forecast is of annual mean runoff, the factors cannot be selected from within the same year as the forecast; moreover, the influence of meteorological factors on runoff is lagged. Therefore, following Table 1, a one-to-one correspondence is established between the year-by-year (1960-2011) annual mean runoff of the Jinping Hydropower Station and the month-by-month 74 atmospheric circulation indices of the preceding year (1959-2010). The correspondence between the time series of one atmospheric circulation index and the runoff time series is shown in Table 2; the other indices are treated in the same way.
Table 2. Correspondence between the time series of one atmospheric circulation index and the runoff time series
Step 2: Preliminary predictor selection based on normalized mutual information
2.1 Divide the index time series and the annual runoff time series into two parts, one serving as training samples for the neural network and the other as test samples for the trained network. In this embodiment, the data of the first 47 years serve as training samples and the data of the last 5 years as test samples.
2.2 Calculate the mutual information MI. Using the training sample data, calculate the mutual information between each month-by-month index time series and the corresponding runoff time series. In this embodiment, this means using formula (1) to calculate the mutual information between the annual mean runoff series in column 1 of Table 2 and the index time series in each of the remaining columns. Note that, to keep the test results reliable, the mutual information used to screen the preliminary predictors is calculated from the training sample data only; the test sample data must not be included.
2.3 Calculate the normalized mutual information NMI, i.e. map the MI values calculated in step 2.2 to between 0 and 1 using formulas (2), (3) and (4).
2.4 Significance test of the mutual information. This embodiment tests the mutual information by the bootstrap method, in the following steps:
2.4.1 Calculate the NMI value of the original runoff time series and index time series;
2.4.2 Randomly shuffle the order of the two time series 100 times, calculate the NMI value after each shuffle, and arrange the values in descending order;
2.4.3 Take the probability quantile of the ordered NMI values as the NMI threshold for the corresponding significance level;
2.4.4 If the NMI value of the original time series exceeds the NMI value corresponding to a given probability threshold (95% in this embodiment), the two sets of data are considered significantly correlated.
2.5 Select as preliminary predictors the indices that pass the significance test and whose normalized mutual information exceeds a given threshold (0.9 in this embodiment). In this embodiment, 205 indices have a normalized mutual information above 0.9; the first 20 are listed below:
Table 3. The first 20 preliminarily selected predictors
Preliminarily selected factor | NMI | MI |
Sunspots, August | 0.988375 | 5.426929 |
Sunspots, April | 0.988375 | 5.426929 |
Sunspots, July | 0.988375 | 5.426929 |
Sunspots, October | 0.988375 | 5.426929 |
Sunspots, December | 0.988375 | 5.426929 |
Sunspots, February | 0.98444 | 5.384376 |
Sunspots, September | 0.98444 | 5.384376 |
Sunspots, November | 0.98444 | 5.384376 |
Sunspots, January | 0.98444 | 5.384376 |
Sunspots, March | 0.98444 | 5.384376 |
Sunspots, May | 0.98444 | 5.384376 |
Northern Hemisphere subtropical high intensity index (5E-360), August | 0.980474 | 5.341823 |
Northern Hemisphere polar vortex area index (zone 5, 0-360), March | 0.980474 | 5.341823 |
North Africa-Atlantic-North America subtropical high intensity index (110W-60E), June | 0.976477 | 5.299270 |
Northern Hemisphere subtropical high intensity index (5E-360), June | 0.976291 | 5.256717 |
Northern Hemisphere subtropical high intensity index (5E-360), April | 0.972448 | 5.256717 |
North Africa-Atlantic-North America subtropical high intensity index (110W-60E), July | 0.972448 | 5.256717 |
North Africa-Atlantic-North America subtropical high intensity index (110W-60E), September | 0.972448 | 5.256717 |
Sunspots, June | 0.972448 | 5.256717 |
Pacific subtropical high intensity index (110E-115W), June | 0.970919 | 5.240655 |
Step 3: Perform kernel principal component analysis and select principal components as predictors. In this embodiment, 205 factor series were selected in step 2.5, and multicollinearity is common among them. Predictors with multicollinearity enlarge the weight matrix of the neural network, and the repeated information and noise directly affect the training speed and generalization ability of the network, so feature extraction with noise reduction and redundancy removal is needed. This embodiment uses the radial basis function as the kernel function for kernel principal component analysis and calculates the principal components according to formulas (5), (6), (9), (11), (12), (13) and (14). The resulting principal components are arranged in descending order of variance contribution rate. The variance contribution rates of the first 5 principal components are given in Table 4, and the corresponding data of the first 5 principal components in Table 5.
Table 4. Variance contribution rates of the first 5 principal components
Principal component | PC 1 | PC 2 | PC 3 | PC 4 | PC 5 |
Variance contribution rate | 25.7% | 6.9% | 5.6% | 5.1% | 3.9% |
Table 5. First 5 principal components extracted by KPCA
In this embodiment, trial and error is used to decide which principal components to select as predictors. Repeated trials showed that the test-period forecast is best when the first two principal components are used as predictors, so the first two principal components were finally chosen. Note that, so that the training and test samples are subject to the same standard when the principal components are extracted, the training sample series and the test sample series must be combined before KPCA is performed. In this embodiment, the training sample series has length 47 and the test sample series has length 5, so the combined series has length 52, and the principal component series in Table 5 therefore have length 52.
Step 4: Build the Elman neural network model
4.1 To build the Elman network model, first determine the network structure, i.e. the number of nodes in each layer. The number of nodes in each layer is determined as follows:
(1) The number of input layer nodes equals the number of predictors. This embodiment uses the first two principal components as predictors, so the Elman neural network has 2 input layer nodes;
(2) The number of output layer nodes equals the number of predictands. This embodiment makes a single-value forecast of annual mean runoff, so there is 1 output layer node;
(3) The number of context layer nodes equals the number of hidden layer nodes;
(4) The number of hidden layer nodes strongly affects the generalization performance of the network, but there is as yet no systematic, standard method for determining it. A reasonable choice is trial and error: use different numbers of hidden layer nodes, observe the forecast performance of the network, and choose the number of hidden layer nodes accordingly. In this embodiment, because KPCA was used beforehand to denoise the factor data and remove redundancy, and the resulting principal components are mutually orthogonal, overfitting of the neural network is effectively prevented. As a result, with 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 and 15 hidden layer nodes, the relative errors of the test-period forecasts are all within 20%; the network is very stable and generalizes well. Repeated experiments showed that with 10 hidden layer nodes the maximum relative error of the test-period forecast falls below 15%, so the number of hidden layer nodes is set to 10.
4.2 Building the Elman network model also requires a training algorithm. This embodiment uses the back-propagation algorithm together with gradient descent with a momentum term and an adaptive learning rate to update the weights of the network. The weight update formulas are formula (15) and formula (16).
Step 5:The single model forecast of run-off
5.1 according to the principal component factor sequence and regional history footpath to be predicted that described in step 2.1, step 3.5 is extracted
Flow sequence to normalize according to formula (1), be then divided into training sample and test samples.In the present embodiment, two step 3 selected
The data of 47 years are as training sample, the number of latter 5 years before individual chief composition series and Jinping Hydroelectric Power Station mean annual runoff sequence
According to as test samples.
5.2 The factor data in the training samples are used as the network input and the historical runoff data in the training samples as the network output, and the network is trained by supervised learning. The learning process can be summarized as follows:
(1) Initialize the connection weights between the layers of the network with a random function, and set the error ε allowed by the cost function. This embodiment uses the mean squared error (MSE) function as the cost function;
(2) Feed the learning samples to the network, compute the value E of the mean squared error function with the training algorithm, and update the connection weights between the layers of the network according to E;
(3) If E is greater than ε, return to step (2); otherwise learning ends and the network output is computed.
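The three learning steps above can be sketched with a very small Elman-style network. This is an illustration only and not the embodiment's implementation: the class name, layer sizes, plain gradient descent (no momentum or adaptive rate), the `max_epochs` safety cap and the toy sine data are all assumptions. The gradient treats the context layer as an ordinary input, so no error is propagated back through time.

```python
import numpy as np

class TinyElman:
    """One hidden tanh layer whose previous activation is fed back as context."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)          # step (1): random initial weights
        self.W_in = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.W_ctx = rng.normal(0.0, 0.5, (n_hidden, n_hidden))
        self.b_h = np.zeros(n_hidden)
        self.w_out = rng.normal(0.0, 0.5, n_hidden)
        self.b_out = 0.0

    def forward(self, X):
        ctx = np.zeros_like(self.b_h)
        hidden, contexts, outputs = [], [], []
        for x in X:                                # samples are processed in time order
            contexts.append(ctx)
            h = np.tanh(self.W_in @ x + self.W_ctx @ ctx + self.b_h)
            hidden.append(h)
            outputs.append(self.w_out @ h + self.b_out)
            ctx = h                                # context layer copies the hidden state
        return np.array(outputs), np.array(hidden), np.array(contexts)

    def train(self, X, y, eps=1e-3, lr=0.02, max_epochs=1000):
        for _ in range(max_epochs):
            out, H, C = self.forward(X)            # step (2): compute the MSE value E
            err = out - y
            E = float(np.mean(err ** 2))
            if E <= eps:                           # step (3): stop once E <= eps
                break
            d_out = 2.0 * err / len(y)
            d_pre = (d_out[:, None] * self.w_out) * (1.0 - H ** 2)
            self.w_out -= lr * (d_out @ H)
            self.b_out -= lr * d_out.sum()
            self.W_in -= lr * (d_pre.T @ X)
            self.W_ctx -= lr * (d_pre.T @ C)
            self.b_h -= lr * d_pre.sum(axis=0)
        return E                                   # last MSE that was evaluated

# Toy data (made up): predict a phase-shifted sine from sine and cosine inputs.
t = np.linspace(0.0, 4.0 * np.pi, 40)
X = np.column_stack([np.sin(t), np.cos(t)])
y = np.sin(t + 0.3)
net = TinyElman(n_in=2, n_hidden=4)
final_mse = net.train(X, y)
pred, _, _ = net.forward(X)
```

The `max_epochs` cap is a practical safeguard added here so the loop terminates even if ε is never reached; the text itself specifies only the ε criterion.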
5.3 For the trained network, the factor data in the test samples are used as the network input to test the forecasting performance of the network.
The test results are de-normalized to obtain the forecast runoff values.
5.4 The mean absolute percentage error (MAPE), maximum relative error (MRE) and qualified rate (QR) are used as the evaluation indices of the forecast; the indices are calculated according to formulas (17), (18), (19) and (20).
To verify the generalization ability of the network model and the stability of its forecasts, this embodiment carried out 100 single-model forecasts. In every forecast, the maximum relative error over the test period was within 16% and the qualified rate reached 100%.
This indicates that the network model used in the present invention generalizes well and forecasts stably. The test-period error statistics of the first 5 forecasts are given in Table 6.
Table 6. Error statistics of the single-model forecasts in the test period
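The three evaluation indices can be computed as in the sketch below. Formulas (17) to (20) are not reproduced in this passage, so the commonly used definitions are assumed: MAPE is the mean of the absolute relative errors, MRE is their maximum, and QR is the share of forecasts whose relative error does not exceed a threshold. The 20% threshold and the sample values are assumptions made for illustration and are not data from Table 6.

```python
import numpy as np

def relative_errors(observed, forecast):
    """Absolute relative error of each forecast value."""
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.abs(forecast - observed) / np.abs(observed)

def mape(observed, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * relative_errors(observed, forecast).mean()

def mre(observed, forecast):
    """Maximum relative error, in percent."""
    return 100.0 * relative_errors(observed, forecast).max()

def qualified_rate(observed, forecast, threshold=0.20):
    """Percentage of forecasts whose relative error is within the threshold."""
    return 100.0 * np.mean(relative_errors(observed, forecast) <= threshold)

# Made-up values for illustration only.
observed = [100.0, 200.0, 400.0, 500.0]
forecast = [110.0, 190.0, 400.0, 450.0]
scores = {
    "MAPE": mape(observed, forecast),
    "MRE": mre(observed, forecast),
    "QR": qualified_rate(observed, forecast),
}
```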
Step 6: Ensemble forecast of runoff
To reduce the bias in the forecast caused by uncertainty in the model parameters, the present invention repeats the single-model runoff forecast several times and takes the average of the individual forecast results as the final forecast result.
In this embodiment, the average of the results of the 100 forecasts can be taken as the final forecast result.
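The ensemble step amounts to averaging the member forecasts, as in the sketch below. The function `toy_single_forecast` is a hypothetical stand-in for one full train-and-forecast run of the network with its own random initialization; its base values and noise level are made up.

```python
import numpy as np

def ensemble_forecast(single_forecast, n_members=100):
    """Run the single-model forecast n_members times and average the results."""
    members = np.stack([single_forecast(seed) for seed in range(n_members)])
    return members.mean(axis=0), members

def toy_single_forecast(seed):
    """Hypothetical single run: a 5-year forecast perturbed by seed-dependent noise."""
    rng = np.random.default_rng(seed)
    base = np.array([1200.0, 1350.0, 1100.0, 1280.0, 1420.0])  # made-up values
    return base * (1.0 + rng.normal(0.0, 0.05, size=base.size))

mean_forecast, members = ensemble_forecast(toy_single_forecast)
```

Passing a different seed to each member mirrors the random weight initialization that makes repeated single-model runs differ from one another.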
The foregoing is merely an embodiment of the present invention and is not intended to limit the invention. Any equivalent substitution made within the principles of the present invention shall fall within its scope of protection.
Content not described in detail in the present invention belongs to the prior art known to those skilled in the art.
Claims (5)
- 1. A medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network, characterized in that the method comprises: preliminary selection of predictors based on mutual information; extraction of principal components by kernel principal component analysis; construction of an Elman neural network model; single-model forecast of runoff; and multi-model ensemble forecast of runoff.
- 2. The medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network as claimed in claim 1, characterized in that the preliminary selection of predictors based on mutual information comprises the following steps:
  (1) calculate the mutual information MI between each index time series and the corresponding runoff time series:
  $$MI(X,Y)=\sum_{i=1}^{n}\sum_{j=1}^{n}p(x_i,y_j)\log\left(\frac{p(x_i,y_j)}{p(x_i)\,p(y_j)}\right)$$
  where $X$ is the runoff time series, $X=(x_1,x_2,x_3,\ldots,x_n)^T$; $Y$ is the index time series, $Y=(y_1,y_2,y_3,\ldots,y_n)^T$; $p(x_i,y_j)$ is the joint distribution of $X$ and $Y$; and $p(x_i)$ and $p(y_j)$ are the marginal distributions of $X$ and $Y$ respectively;
  (2) with the entropies as the denominator, map the MI value to between 0 and 1 to obtain the normalized mutual information NMI:
  $$NMI(X,Y)=2\,\frac{MI(X,Y)}{H(X)+H(Y)}$$
  $$H(X)=-\sum_{i=1}^{n}p(x_i)\log_2 p(x_i)$$
  where $H(X)$ and $H(Y)$ are the entropies of $X$ and $Y$ respectively, and $H(Y)$ is calculated in the same way as $H(X)$;
  (3) test the normalized mutual information using the bootstrap method.
- 3. The medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network as claimed in claim 1, characterized in that extracting principal components by kernel principal component analysis comprises the following steps: (1) standardize the preliminarily selected predictor data by z-score; (2) compute the kernel matrix K of the preliminarily selected predictors; (3) compute the centered kernel matrix K_c and its eigenvalues and eigenvectors, arrange the eigenvalues in descending order, and reorder the eigenvectors to match the eigenvalues; (4) compute the normalized eigenvector matrix A, and compute the projection of the centered kernel matrix K_c onto the eigenvectors to obtain the principal components.
- 4. The medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network as claimed in claim 1, characterized in that the single-model forecast of runoff comprises the following step: (1) use the Elman network model to make a single-model forecast of runoff.
- 5. The medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network as claimed in claim 1, characterized in that the multi-model ensemble forecast of runoff comprises the following steps: (1) use the Elman network model to make multiple single-model forecasts of runoff; (2) take the average of the results of the multiple forecasts as the final forecast result.
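The factor screening of claim 2 and the principal-component extraction of claim 3 can be illustrated with the sketch below. Several choices here are assumptions not stated in the claims: the probabilities are estimated from an equal-width histogram with 5 bins, base-2 logarithms are used throughout, and the kernel is a Gaussian (RBF) kernel with a hypothetical default width. The bootstrap test of step (3) in claim 2 is not included.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (base 2) of a discrete distribution."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def normalized_mutual_information(x, y, bins=5):
    """NMI = 2 * MI / (H(X) + H(Y)), with probabilities from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    mask = p_xy > 0
    mi = float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / np.outer(p_x, p_y)[mask])))
    return 2.0 * mi / (entropy(p_x) + entropy(p_y))

def kernel_pca(data, n_components=2, gamma=None):
    """Kernel PCA scores following steps (1) to (4) of claim 3 (RBF kernel assumed)."""
    # (1) z-score standardization of the screened predictors.
    Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    n = Z.shape[0]
    gamma = 1.0 / Z.shape[1] if gamma is None else gamma
    # (2) kernel matrix K from pairwise squared distances.
    sq = np.sum(Z ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    K = np.exp(-gamma * dist2)
    # (3) centered kernel matrix and its eigen-decomposition, sorted descending.
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # (4) normalized eigenvectors A (leading eigenvalues assumed positive)
    #     and projection of Kc onto them.
    A = vecs / np.sqrt(vals)
    return Kc @ A, vals

# Made-up data for illustration only.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = rng.normal(size=200)
nmi_self = normalized_mutual_information(x, x)
nmi_pair = normalized_mutual_information(x, y)
scores, eigenvalues = kernel_pca(rng.normal(size=(30, 4)))
```

A series compared with itself gives an NMI of 1, which is a quick sanity check on the normalization. The columns of `scores` are mutually orthogonal, consistent with the orthogonality of the principal components noted in the description.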
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710662894.4A CN107463993B (en) | 2017-08-04 | 2017-08-04 | Medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107463993A (en) | 2017-12-12 |
CN107463993B (en) | 2020-11-24 |
Family
ID=60547269
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201710662894.4A | Active | CN107463993B (en) | 2017-08-04 | 2017-08-04 | Medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107463993B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080108881A1 (en) * | 2004-07-10 | 2008-05-08 | Steven Elliot Stupp | Apparatus for aggregating individuals based on association variables |
CN102122370A (en) * | 2011-03-07 | 2011-07-13 | 北京师范大学 | Method for predicting river basin climatic change and analyzing tendency |
CN104008164A (en) * | 2014-05-29 | 2014-08-27 | 华东师范大学 | Generalized regression neural network based short-term diarrhea multi-step prediction method |
CN104091074A (en) * | 2014-07-12 | 2014-10-08 | 西安浐灞生态区管理委员会 | Medium and long term hydrologic forecasting method based on empirical mode decomposition |
CN104463358A (en) * | 2014-11-28 | 2015-03-25 | 大连理工大学 | Small hydropower station power generation capacity predicating method combining coupling partial mutual information and CFS ensemble forecast |
CN104869126A (en) * | 2015-06-19 | 2015-08-26 | 中国人民解放军61599部队计算所 | Network intrusion anomaly detection method |
CN104951847A (en) * | 2014-12-31 | 2015-09-30 | 广西师范学院 | Rainfall forecast method based on kernel principal component analysis and gene expression programming |
CN105139093A (en) * | 2015-09-07 | 2015-12-09 | 河海大学 | Method for forecasting flood based on Boosting algorithm and support vector machine |
CN105354416A (en) * | 2015-10-26 | 2016-02-24 | 南京南瑞集团公司 | Representative power station based basin rainfall runoff power macro-forecasting method |
CN105678422A (en) * | 2016-01-11 | 2016-06-15 | 广东工业大学 | Empirical mode neural network-based chaotic time series prediction method |
US20170039659A1 (en) * | 2014-04-11 | 2017-02-09 | Wuhan University | Daily electricity generation plan making method of cascade hydraulic power plant group |
CN106845371A (en) * | 2016-12-31 | 2017-06-13 | 中国科学技术大学 | A kind of city road network automotive emission remote sensing monitoring system |
CN106971237A (en) * | 2017-02-27 | 2017-07-21 | 中国水利水电科学研究院 | A kind of Medium-and Long-Term Runoff Forecasting method for optimized algorithm of being looked for food based on bacterium |
Non-Patent Citations (2)
Title |
---|
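LIONG S Y et al.: "Flood stage forecasting with SVM", Journal of the American Water Resources Association * |
LI Wei et al.: "Application of three medium- and long-term forecasting models based on principal component analysis at Zhexi Reservoir", Water Power * |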
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108537679B (en) * | 2018-02-08 | 2022-04-12 | 中国农业大学 | Remote sensing and crop model fused region scale crop emergence date estimation method |
CN108537679A (en) * | 2018-02-08 | 2018-09-14 | 中国农业大学 | The regional scale crop emergence date evaluation method that remote sensing is merged with crop modeling |
CN109492825A (en) * | 2018-11-26 | 2019-03-19 | 中国水利水电科学研究院 | Medium-long Term Prediction method based on mutual information and the principal component analysis screening factor |
CN109671507A (en) * | 2018-12-24 | 2019-04-23 | 万达信息股份有限公司 | A kind of obstetrics' disease that calls for specialized treatment coupling index method for digging based on Electronic Health Record |
CN110334546A (en) * | 2019-07-08 | 2019-10-15 | 辽宁工业大学 | Difference privacy high dimensional data based on principal component analysis optimization issues guard method |
CN110852497A (en) * | 2019-10-30 | 2020-02-28 | 南京智慧航空研究院有限公司 | Scene variable slide-out time prediction system based on big data deep learning |
CN112766531B (en) * | 2019-11-06 | 2023-10-31 | 中国科学院国家空间科学中心 | Runoff prediction system and method based on satellite microwave observation data |
CN112766531A (en) * | 2019-11-06 | 2021-05-07 | 中国科学院国家空间科学中心 | Runoff prediction system and method based on satellite microwave observation data |
CN111310968A (en) * | 2019-12-20 | 2020-06-19 | 西安电子科技大学 | LSTM neural network circulation hydrological forecasting method based on mutual information |
CN111310968B (en) * | 2019-12-20 | 2024-02-09 | 西安电子科技大学 | LSTM neural network circulating hydrologic forecasting method based on mutual information |
CN111445085A (en) * | 2020-04-13 | 2020-07-24 | 中国水利水电科学研究院 | Medium-and-long-term runoff forecasting method considering influence of medium-and-large-sized reservoir engineering water storage |
CN117114523A (en) * | 2023-10-23 | 2023-11-24 | 长江三峡集团实业发展(北京)有限公司 | Runoff forecasting model construction and runoff forecasting method based on condition mutual information |
CN117132176A (en) * | 2023-10-23 | 2023-11-28 | 长江三峡集团实业发展(北京)有限公司 | Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening |
CN117132176B (en) * | 2023-10-23 | 2024-01-26 | 长江三峡集团实业发展(北京)有限公司 | Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening |
CN117114523B (en) * | 2023-10-23 | 2024-02-02 | 长江三峡集团实业发展(北京)有限公司 | Runoff forecasting model construction and runoff forecasting method based on condition mutual information |
Also Published As
Publication number | Publication date |
---|---|
CN107463993B (en) | 2020-11-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||