CN116663404A - Flood forecasting method and system coupling artificial intelligence and Bayesian theory - Google Patents
- Publication number
- CN116663404A (application number CN202310592943.7A)
- Authority
- CN
- China
- Prior art keywords
- precipitation
- model
- flood
- bayesian
- training
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01W—METEOROLOGY
- G01W1/00—Meteorology
- G01W1/10—Devices for predicting weather conditions
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01W—METEOROLOGY
- G01W1/00—Meteorology
- G01W1/14—Rainfall or precipitation gauges
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G08—SIGNALLING
- G08B—SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
- G08B21/00—Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
- G08B21/02—Alarms for ensuring the safety of persons
- G08B21/10—Alarms for ensuring the safety of persons responsive to calamitous events, e.g. tornados or earthquakes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/08—Probabilistic or stochastic CAD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/10—Numerical modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/08—Fluids
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A10/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
- Y02A10/40—Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Environmental & Geological Engineering (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Medical Informatics (AREA)
- Probability & Statistics with Applications (AREA)
- Environmental Sciences (AREA)
- Ecology (AREA)
- Biodiversity & Conservation Biology (AREA)
- Atmospheric Sciences (AREA)
- Geometry (AREA)
- Computer Hardware Design (AREA)
- Emergency Management (AREA)
- Business, Economics & Management (AREA)
- Geology (AREA)
- General Life Sciences & Earth Sciences (AREA)
- Hydrology & Water Resources (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Algebra (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
Abstract
The invention belongs to the technical field of runoff prediction and discloses a flood forecasting method and system coupling artificial intelligence and Bayesian theory. The method comprises the following steps: calculating the correlation coefficient between each candidate precipitation influence factor and the precipitation amount, and obtaining the final precipitation influence factors through principal component analysis; inputting the final precipitation influence factors into Bayesian-NHMM models with different numbers of precipitation states to forecast precipitation, and determining the optimal precipitation forecast model; constructing training samples from the precipitation values predicted by the optimal precipitation forecast model, taking into account the influence of climate change and underlying surface change on runoff; and training machine learning models on the training samples to obtain a runoff forecasting model, thereby realizing flood forecasting. The invention combines the advantages of a mathematical statistical model and machine learning methods, fully considers how the runoff generation and concentration mechanism of the basin evolves under a changing environment, and can provide a practical reference for flood control and disaster reduction in flood-prone areas.
Description
Technical Field
The invention belongs to the technical field of runoff prediction, and particularly relates to a flood forecasting method and system coupled with artificial intelligence and Bayesian theory.
Background
Floods in large watersheds can cause significant economic losses, and studies have shown that precipitation is the dominant contributor to regional runoff. However, traditional runoff forecasting relies only on measured rainfall and runoff data, so its forecast period is short, its accuracy is limited, and it struggles to meet the needs of flood emergency management. To extend the forecast period, precipitation forecasting must first be considered and a coupled precipitation forecast-runoff forecast model established. As an important link in the water cycle, the precipitation process is driven by teleconnections with large-scale climate factors and is also closely related to local hydrometeorological factors and underlying surface conditions. Constructing a reasonable and reliable precipitation prediction model that fully accounts for the influence of large-scale and local factors on the regional precipitation process, and coupling it with a regional runoff prediction model, is one of the difficulties in developing flood forecasts with a long forecast period and high accuracy.
In addition, under a changing environment, floods at each river section of a basin are nonlinear, high-dimensional and have a complex spatio-temporal distribution. Besides precipitation, floods are also related to local hydrometeorological factors such as near-surface wind speed, relative humidity and soil moisture. Vegetation cover also participates in the regional water and carbon cycles through respiration and photosynthesis, and is an important factor influencing regional runoff generation and concentration characteristics. Another difficult problem in current flood forecasting is how to comprehensively consider these multiple influences under a changing environment, incorporate local hydrometeorological factors and underlying surface cover conditions into the runoff forecast, obtain basin flood forecast information based on physical mechanism analysis, extend the forecast period, improve forecast accuracy, and build a medium- and long-term flood early warning mechanism.
Disclosure of Invention
In view of the defects of the prior art and the demand for improvement, the invention provides a flood forecasting method and system coupling artificial intelligence and Bayesian theory, which aims to solve the short forecast period and low forecast accuracy of flood forecasting in a complex environment.
To achieve the above object, according to a first aspect of the present invention, a flood forecasting method coupling artificial intelligence and Bayesian theory is provided, comprising the following steps:
S1. Collecting multiple kinds of flood-related data for a research area as candidate precipitation influence factors;
S2. Calculating the correlation coefficient between each candidate precipitation influence factor and the precipitation amount, screening out the precipitation influence factors whose correlation coefficients exceed a preset threshold, and further processing the screened factors through principal component analysis to obtain the final precipitation influence factors; constructing a Bayesian-NHMM model, and determining several candidate values of the number of precipitation states k in the Bayesian-NHMM model;
S3. Inputting the final precipitation influence factors into the Bayesian-NHMM model for each candidate value of k, carrying out precipitation prediction, and taking the Bayesian-NHMM model with the minimum precipitation prediction error as the optimal precipitation prediction model;
S4. Constructing training samples based on the series of precipitation values predicted by the optimal precipitation prediction model, and classifying the training samples to obtain N classes of training subsamples;
S5. Training a machine learning model on each of the N classes of training subsamples to obtain N trained machine learning models, which serve as runoff forecasting models; and realizing flood forecasting based on the runoff forecasting models.
As a further preferred aspect, the flood-related data comprise large-scale climate factor data, hydrological factor data and underlying surface influence factor data. The large-scale climate factor data comprise an ENSO index, an atmospheric circulation index and the Indian Ocean Dipole; the hydrological factor data comprise runoff, precipitation, air temperature, wind speed, air humidity and net radiation in the research area; and the underlying surface influence factor data comprise the leaf area index and solar-induced chlorophyll fluorescence of the research area.
As a further preferred aspect, step S4 specifically includes:
building the training samples: the training samples comprise the precipitation values predicted by the optimal precipitation prediction model, the hydrological factor data and the underlying surface influence factor data;
dividing the precipitation values predicted by the optimal precipitation prediction model into N classes by magnitude, and dividing the other training samples into N classes by the K-means clustering method, to obtain N classes of training subsamples.
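The classification in step S4 can be sketched as follows. This is a minimal illustration only: the class boundaries in `thresholds`, the feature columns in `samples`, and the deterministic start of the clustering are assumptions made for the example, not values or choices given in the patent.

```python
import math
from bisect import bisect_right


def magnitude_class(precip, thresholds):
    """Return the class index of a predicted precipitation value.

    `thresholds` are hypothetical, ascending class boundaries.
    """
    return bisect_right(thresholds, precip)


def kmeans(points, k, iterations=50):
    """Plain K-means on equal-length feature vectors (lists of floats)."""
    ordered = sorted(points)
    # Deterministic start: k points spread evenly through the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels, centers


# Hypothetical samples: [air temperature, leaf area index]
samples = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]]
labels, centers = kmeans(samples, k=2)
```

In practice the feature columns would normally be standardized before clustering so that no single variable dominates the distance.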
As a further preferred aspect, in step S2, the method for determining the final precipitation impact factor specifically includes:
(1) Calculating the correlation coefficient between each candidate precipitation predictor and the precipitation amount;
(2) Selecting the precipitation predictors whose correlation coefficients exceed a preset threshold, and processing the selected precipitation predictors through principal component analysis;
(3) Performing k-fold cross validation on the processed precipitation predictors; if the validation fails, adjusting the preset threshold and returning to step (2); if the validation passes, the processed precipitation predictors are the final precipitation influence factors.
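The screen-validate-adjust loop in steps (1) to (3) might look like the sketch below. Several things here are assumptions: the text does not say in which direction the threshold is adjusted (the sketch lowers it by a fixed `step`), the `validate` callable is a hypothetical stand-in for the k-fold cross validation, and the principal component analysis between screening and validation is omitted for brevity.

```python
def select_factors(correlations, validate, threshold=0.5, step=0.05,
                   min_threshold=0.1):
    """Screen factors by |correlation| and relax the threshold until
    the (user-supplied) validation passes.

    correlations: dict mapping factor name -> correlation with precipitation
    validate:     hypothetical callable, list of names -> bool
    """
    while threshold >= min_threshold:
        selected = [name for name, r in correlations.items()
                    if abs(r) > threshold]
        if selected and validate(selected):
            return selected, threshold
        threshold -= step  # assumption: adjust by lowering the threshold
    raise ValueError("no factor set passed validation")


# Illustrative correlations; the numbers are made up for the example.
correlations = {"ENSO": 0.62, "IOD": 0.41, "LAI": 0.33, "wind": 0.08}
# Stand-in validation rule: require at least three factors.
selected, used_threshold = select_factors(correlations,
                                          lambda names: len(names) >= 3)
```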
As a further preferred aspect, in step S2, when calculating the correlation coefficient, the Pearson correlation coefficient is adopted if the candidate precipitation predictor and the precipitation amount are linearly correlated, and the mutual information index is adopted as the correlation coefficient if they are nonlinearly correlated.
As a further preferred aspect, in step S2, the candidate values of the number of precipitation states k in the Bayesian-NHMM model are determined as follows:
presetting a plurality of k values, calculating the BIC function value of the Bayesian-NHMM model for each k value, plotting the BIC function value against k, and taking the k values corresponding to the inflection points of this curve as the candidate values.
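To illustrate why both measures are used, the sketch below computes the Pearson coefficient and a simple histogram estimate of mutual information. The equal-width binning and the bin count are assumptions of this example; the patent does not specify how mutual information is estimated.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _bin_index(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information, in nats."""
    bx, by = _bin_index(x, bins), _bin_index(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
               for (i, j), c in joint.items())


x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y_linear = [2.0 * v + 1.0 for v in x]   # linear dependence
y_quadratic = [v * v for v in x]        # nonlinear dependence

r_linear = pearson(x, y_linear)              # close to 1
r_quadratic = pearson(x, y_quadratic)        # close to 0 despite dependence
mi_quadratic = mutual_information(x, y_quadratic)  # clearly positive
```

The quadratic case shows a relationship that the Pearson coefficient misses but mutual information detects.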
As a further preferred aspect, in step S5, four machine learning models are used, namely an artificial neural network, a support vector machine, a random forest model and a long short-term memory model;
each class of training subsamples is input into the four machine learning models for training to obtain four trained machine learning models as runoff forecasting sub-models; the weight of each runoff forecasting sub-model is then determined according to its accuracy in simulating runoff, thereby obtaining the runoff forecasting model corresponding to that class of training subsamples.
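The weighting of the four sub-models can be sketched with inverse-RMSE weights, which is the weighting the detailed description mentions (a weighted average based on the reciprocal of the root mean square error). The runoff numbers below are invented for illustration, and the sketch assumes no sub-model has an RMSE of exactly zero.

```python
import math


def rmse(simulated, observed):
    """Root mean square error between simulated and observed runoff."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed))
                     / len(observed))


def inverse_rmse_weights(simulations, observed):
    """Weights proportional to 1/RMSE, normalised to sum to 1."""
    inverse = {name: 1.0 / rmse(sim, observed)
               for name, sim in simulations.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(forecasts, weights):
    """Weighted average of the sub-model forecasts for one time step."""
    return sum(weights[name] * forecasts[name] for name in weights)


# Illustrative calibration-period runoff (values are made up).
observed = [100.0, 200.0, 300.0]
simulations = {
    "ANN": [110.0, 210.0, 310.0],
    "SVM": [120.0, 220.0, 320.0],
    "RF": [90.0, 190.0, 290.0],
    "LSTM": [105.0, 205.0, 305.0],
}
weights = inverse_rmse_weights(simulations, observed)
forecast = combine({"ANN": 250.0, "SVM": 260.0, "RF": 240.0, "LSTM": 255.0},
                   weights)
```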
As a further preferred aspect, realizing flood forecasting based on the runoff forecasting models comprises:
acquiring the relevant data of the research area in real time, calculating the similarity between these data and each of the N classes of training subsamples, selecting the runoff forecasting model corresponding to the class with the highest similarity, and inputting the data into that model to obtain the forecast runoff, thereby realizing flood forecasting.
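A minimal sketch of this model selection follows. The patent does not define the similarity measure; the sketch assumes that similarity is the Euclidean distance between the new data and the centroid of each class, with smaller distance meaning higher similarity. The class names and feature values are hypothetical.

```python
import math


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def most_similar_class(sample, class_subsamples):
    """Return the label of the class whose centroid is nearest to `sample`."""
    best_label, best_distance = None, math.inf
    for label, rows in class_subsamples.items():
        distance = math.dist(sample, centroid(rows))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Hypothetical classes: [predicted precipitation, relative humidity]
class_subsamples = {
    "low": [[10.0, 0.2], [12.0, 0.3]],
    "high": [[80.0, 0.8], [90.0, 0.9]],
}
chosen = most_similar_class([85.0, 0.7], class_subsamples)
```

The label returned would then be used to look up the runoff forecasting model trained on that class.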
According to a second aspect of the present invention, there is provided a system for implementing the above flood forecasting method coupling artificial intelligence and Bayesian theory, comprising:
a database construction module, used for collecting multiple kinds of flood-related data for the research area as candidate precipitation influence factors;
a first parameter estimation module, used for calculating the correlation coefficient between each candidate precipitation influence factor and the precipitation amount, screening out the precipitation influence factors whose correlation coefficients exceed a preset threshold, and further processing the screened factors through principal component analysis to obtain the final precipitation influence factors; and for constructing a Bayesian-NHMM model and determining several candidate values of the number of precipitation states k in the Bayesian-NHMM model;
a precipitation prediction model construction module, used for inputting the final precipitation influence factors into the Bayesian-NHMM model for each candidate value of k, carrying out precipitation prediction, and taking the Bayesian-NHMM model with the minimum precipitation prediction error as the optimal precipitation prediction model;
a second parameter estimation module, used for constructing training samples based on the series of precipitation values predicted by the optimal precipitation prediction model, and classifying the training samples to obtain N classes of training subsamples;
a combined forecasting module, used for training a machine learning model on each of the N classes of training subsamples to obtain N trained machine learning models as runoff forecasting models, and realizing flood forecasting based on the runoff forecasting models.
According to a third aspect of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above flood forecasting method coupling artificial intelligence and Bayesian theory.
In general, compared with the prior art, the above technical solution conceived by the present invention mainly has the following technical advantages:
1. The invention forecasts precipitation with a specially designed Bayesian-NHMM model and then trains machine learning models with the predicted values as training data to forecast runoff. This makes full use of the advantages of both the Bayesian-NHMM mathematical statistical model and current machine learning models, addresses the short forecast period and low forecast accuracy of traditional runoff forecasting, and allows future flood risk to be inferred and flood risk early warnings to be issued, providing a basis for flood prevention decision-making departments to start emergency response.
2. The invention makes full use of the measured hydrometeorological data of the research area and, in particular, considers the influence of current climate change and underlying surface change (with emphasis on vegetation photosynthesis and respiration) on the current regional runoff generation and concentration mechanism. Coupling the Bayesian-NHMM precipitation prediction model with the runoff machine learning models provides a strong statistical basis and objectively reflects regional runoff generation characteristics.
Drawings
FIG. 1 is a flow chart of a flood forecast method coupling artificial intelligence and Bayesian theory in accordance with an embodiment of the present invention;
FIG. 2 is a flow chart of precipitation impact factor screening according to an embodiment of the invention;
FIG. 3 is a workflow diagram of the Bayesian-NHMM model according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Based on long series of measured large-scale climate factors, hydrological data of the research area, and LAI and SIF indices, the invention fully considers the influence of multiple factors on precipitation. Principal component analysis and k-fold cross validation are used to determine the precipitation predictors, and a Bayesian-NHMM precipitation prediction model, with its number of states chosen by the BIC criterion, is constructed to predict precipitation. The runoff predictors are classified and graded by the K-means clustering method, and each class is input into an artificial neural network (ANN), a support vector machine (SVM), a random forest (RF) model and a long short-term memory (LSTM) model to obtain a combination of predicted section runoffs under the influence of different physical factors. The model-predicted runoffs are then averaged with weights based on the reciprocal of the root mean square error to obtain the final section runoff prediction for the research area, from which the medium- and long-term flood risk early warning result is estimated.
The technical scheme of the invention is further described through the following embodiment with reference to the accompanying drawings. As shown in FIG. 1, the medium- and long-term flood forecasting method coupling artificial intelligence and Bayesian theory provided by the embodiment of the invention comprises the following steps:
Step S1: Constructing the database. The acquired data comprise large-scale climate factor data, hydrological factor data and underlying surface influence factor data, and are used as candidate precipitation influence factors.
Specifically, step S1 includes:
acquiring large-scale climate factors such as the ENSO index, the atmospheric circulation index and the Indian Ocean Dipole, and constructing a large-scale climate factor database;
determining the research area, collecting its measured hydrological factor data, including runoff, precipitation, air temperature, wind speed, air humidity and net radiation, and constructing a hydrological factor database;
in addition, collecting the leaf area index (LAI, an important parameter for measuring the intensity of energy and matter exchange between the ecosystem and the atmosphere) and solar-induced chlorophyll fluorescence (SIF, an important parameter indicating vegetation photosynthesis and physiological state) of the research area, and constructing an underlying surface influence factor database.
Step S2: Screening the precipitation influence factors and determining the number of states. As shown in FIG. 2, the precipitation influence factors are first ranked by importance based on the linear correlation coefficient and the nonlinear mutual information index, the influence factors are then reduced in dimension through principal component analysis, and the final precipitation influence factors are determined through k-fold cross validation. The number of precipitation states in the Bayesian non-homogeneous hidden Markov precipitation probability prediction model (Bayesian-NHMM) is determined by means of the BIC function.
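The dimension reduction step can be illustrated with a small principal component analysis. The sketch below extracts only the first principal component by power iteration on the covariance matrix, which is one simple way to do it and not necessarily how the patent implements it; the sample values are invented. It assumes the starting vector is not orthogonal to the leading component.

```python
import math


def first_principal_component(rows, iterations=200):
    """Leading principal component of `rows` via power iteration.

    Returns (direction, explained variance ratio, projected scores).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the eigenvalue of the unit-length vector.
    eigenvalue = sum(vec[a] * sum(cov[a][b] * vec[b] for b in range(p))
                     for a in range(p))
    explained = eigenvalue / sum(cov[a][a] for a in range(p))
    scores = [sum(c * v for c, v in zip(row, vec)) for row in centered]
    return vec, explained, scores


# Two strongly correlated, hypothetical predictors (second is about 2x the first).
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, explained, scores = first_principal_component(rows)
```

Here nearly all of the variance lies along one direction, so the two predictors can be replaced by the single column of `scores`.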
Specifically, step S2 includes:
S2.1: Calculating and ranking the correlation between the candidate precipitation influence factors and the precipitation amount.
(1) Calculating the correlation coefficient between each candidate precipitation predictor and the precipitation amount: linear correlation is calculated using the Pearson correlation coefficient, and nonlinear correlation is calculated using the mutual information index. Specifically, linear and nonlinear analyses are performed between every candidate precipitation predictor and the precipitation, and whether the relationship is linear or nonlinear is judged according to the significance level. In addition, because the influence of large-scale climate factors, LAI and SIF on regional precipitation is lagged, the correlation between each factor in the preceding 1 to 12 months and the precipitation is calculated, and for each factor the month with the largest correlation with precipitation is selected.
(2) Selecting precipitation influence factors based on principal component analysis: the precipitation predictors whose correlation coefficients exceed a preset threshold are selected and then reduced in dimension through principal component analysis.
(3) Determining the final precipitation influence factors through k-fold cross validation: k-fold cross validation is performed on the processed precipitation predictors. If the validation fails, the preset threshold is adjusted and the procedure returns to step (2); if the validation passes, the processed precipitation predictors are the final precipitation influence factors. In this embodiment, k=0.6, that is, 0.6 of the data length is used for calibration and the remaining 0.4 for verification, so as to determine the final precipitation influence factors.
S2.2: Determining the number of precipitation states in the Bayesian non-homogeneous hidden Markov precipitation probability prediction model (Bayesian-NHMM) based on the BIC function.
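The lag screening described in item (1) can be sketched as a search over lags of 1 to 12 months. The series below are synthetic and built so that precipitation follows the factor with a 3-month delay; the function names are hypothetical, and Pearson correlation is used here although the text also allows mutual information.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag(factor, precip, max_lag=12):
    """Lag (in months) at which the factor correlates most with precipitation.

    A lag of L pairs the factor at month t - L with precipitation at month t.
    """
    scores = {lag: pearson(factor[:-lag], precip[lag:])
              for lag in range(1, max_lag + 1)}
    lag = max(scores, key=lambda candidate: abs(scores[candidate]))
    return lag, scores[lag]


# Synthetic monthly series: precipitation repeats the factor 3 months later.
factor = [float((3 * t) % 17) for t in range(60)]
precip = [0.0, 0.0, 0.0] + factor[:-3]
lag, correlation = best_lag(factor, precip)
```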
As shown in FIG. 3, the Bayesian-NHMM model is a model based on hidden state transformations in which precipitation on each day corresponds to a hidden state, the state transformations being determined by first order Markov chain features. Q in FIG. 3 t Probability transition matrix, X, taking into account influencing variables for moment t t,b 、X t,a Influencing variables for different categories of exogenously. The probability of transition between different states is time-varying and closely related to the input predictors. Thus, the Bayesian-NHMM model can reflect precipitation and predictive factorsSpatial correlation and temporal dependence between the children. The selection of different numbers of hidden states has a great influence on the simulation result of the model, so that the determination of the number of precipitation states is an important step in the establishment of the model. Determination of the number of states k may be accomplished by comparing BIC function values with models having different k values. The BIC function value is calculated by the following steps:
$\mathrm{BIC} = 2P - k\log(T)$ (1)
wherein: P is the maximum likelihood estimate, which depends on the model; k is the number of precipitation states of the model; T is the number of days of data. The maximum likelihood estimate of the precipitation probability distribution based on the hidden state Z, $P(R_t \mid X, Z, \delta, \theta)$, can be calculated by the following formula:
wherein: t = 1, 2, …, T is time; $R_t$ is the observed precipitation at time t; $X_t = (X_{t,1}, X_{t,2}, \dots, X_{t,p})$ are the p precipitation predictors at time t; θ is the unknown parameter related to the probability matrix; δ is the unknown parameter related to the transition probability; $\rho_j$ is the coefficient associated with $X_t$ in the probability distribution; and $\delta = \delta_{i,j} = (\rho_j, \varepsilon_{i,j})$.
Using the above formulas, the BIC function values of the Bayesian-NHMM model are calculated for different k values, a curve of the BIC function value against k is established, and the k values at all inflection points of the curve are taken as candidate values. If there is no inflection point, the k value corresponding to the minimum BIC value is selected.
Step S3: Precipitation probability prediction based on the Bayesian-NHMM model. Based on the selected groups of predictors and the candidate k values of the number of precipitation states, Bayesian-NHMM models with different precipitation predictor combinations are constructed, and the optimal precipitation prediction model is determined using the coefficient of variation of the root mean square error (CVRMSE) as the criterion, thereby obtaining the precipitation prediction result.
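The selection of candidate k values from the curve can be sketched as below. The text does not state how an inflection point is detected numerically, so this sketch assumes one reasonable reading: an inflection point is where the discrete second difference of the BIC curve changes sign. The function name and the example BIC values are hypothetical, and the BIC values themselves are taken as given inputs.

```python
def candidate_state_counts(ks, bics):
    """Return k values at inflection points of the k-vs-BIC curve.

    Falls back to the k with the minimum BIC when no inflection point exists.
    """
    # Discrete second differences approximate the curvature at interior points;
    # curvature[j] belongs to ks[j + 1].
    curvature = [
        bics[i - 1] - 2 * bics[i] + bics[i + 1] for i in range(1, len(bics) - 1)
    ]
    candidates = []
    for j in range(1, len(curvature)):
        if curvature[j - 1] * curvature[j] < 0:  # sign change: inflection point
            candidates.append(ks[j + 1])
    if candidates:
        return candidates
    return [ks[bics.index(min(bics))]]

# Hypothetical BIC values for k = 2..6.
with_inflection = candidate_state_counts([2, 3, 4, 5, 6], [100, 80, 70, 68, 50])
without_inflection = candidate_state_counts([2, 3, 4, 5, 6], [100, 80, 70, 65, 63])
```

The first curve changes from convex to concave at k=5, so 5 is the only candidate; the second curve is convex throughout, so the k with the smallest BIC (6) is returned.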
Specifically, step S3 includes:
Bayesian-NHMM models with different precipitation predictor combinations are constructed from the final precipitation influence factors and the candidate values of the number of precipitation states k determined in step S2. Then the coefficient of variation of the root mean square error (CVRMSE) of the simulation results for the different precipitation predictors is calculated; that is, the CVRMSE between the precipitation prediction results of the Bayesian-NHMM model and the observations is calculated for each candidate value of k, giving the performance of each group of models. The CVRMSE is calculated as follows:
wherein: np is the number of forecast days; $S_i$ is the forecast precipitation on day i; $R_i$ is the observed precipitation on day i.
Finally, the model with the minimum CVRMSE is selected as the optimal precipitation prediction model, and precipitation is predicted from the final precipitation influence factors using this model.
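The model selection in step S3 can be illustrated with a short sketch. The CVRMSE formula is not reproduced in this text, so the code assumes the commonly used definition, the root mean square error divided by the mean of the observations; the function names and the example values are hypothetical.

```python
import math

def cvrmse(simulated, observed):
    """RMSE of the simulation divided by the mean observation (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((s - r) ** 2 for s, r in zip(simulated, observed)) / n)
    return rmse / (sum(observed) / n)

def select_best_k(forecasts_by_k, observed):
    """Return the candidate k whose forecast has the smallest CVRMSE."""
    return min(forecasts_by_k, key=lambda k: cvrmse(forecasts_by_k[k], observed))

# Hypothetical forecasts from models with k = 3, 4 and 5 precipitation states.
observed = [2.0, 4.0, 6.0]
forecasts_by_k = {3: [3.0, 5.0, 7.0], 4: [2.0, 4.0, 7.0], 5: [0.0, 4.0, 6.0]}
best_k = select_best_k(forecasts_by_k, observed)
```

With these illustrative numbers the k=4 model has the smallest CVRMSE and would be kept as the optimal precipitation prediction model.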
Step S4: Classification and grading of the runoff forecasting factors. Considering the combined influence of climate change and underlying surface change on runoff, the relevant hydrological elements (precipitation and runoff in the past period; air temperature, wind speed, air humidity and net radiation in the past and future periods), SIF, LAI and the precipitation values forecast by the Bayesian-NHMM model are selected as clustering variables (training samples), and the K-means clustering method is used to classify and grade the runoff forecasting factors.
Specifically, step S4 includes:
The K-means clustering algorithm is a classical partition-based clustering method. Its principle is simple, it converges quickly and clusters well, its results are easy to interpret, and the only parameter to tune is the number of clusters. The basic principle of K-means is: for a given sample set, divide the samples into n clusters according to the distances between them, so that points within a cluster are as close together as possible and the distances between clusters are as large as possible.
In this embodiment, the most important factor affecting daily runoff is precipitation. The precipitation values for the coming period are therefore predicted by the optimal precipitation prediction model and divided into 3 classes (low, medium and high) as required, that is, the number of clusters is preliminarily set to 3. The distances between the other factors and the 3 classes are then calculated with the K-means clustering algorithm, so that all training samples are classified and 3 classes of subsamples are obtained.
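One way to read the classification in step S4 is sketched below: the low, medium and high thirds of the forecast precipitation seed three centroids, and plain K-means iterations then assign every training sample to the nearest centroid. The seeding by equal thirds, the Euclidean distance, the function names and the example samples are all assumptions made for illustration; in practice the variables would normally be standardized first.

```python
import math

def tercile_centroids(samples):
    """Seed 3 centroids from the low/medium/high thirds of column 0 (forecast precipitation)."""
    ordered = sorted(samples, key=lambda s: s[0])
    n = len(ordered)  # assumes at least 3 samples
    bounds = [0, n // 3, 2 * n // 3, n]
    groups = [ordered[bounds[i]:bounds[i + 1]] for i in range(3)]
    return [[sum(col) / len(g) for col in zip(*g)] for g in groups]

def kmeans(samples, centroids, max_iter=100):
    """Lloyd iterations: assign to the nearest centroid, then recompute the means."""
    for _ in range(max_iter):
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(s, centroids[c]))
            for s in samples
        ]
        updated = []
        for c, old in enumerate(centroids):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(
                [sum(col) / len(members) for col in zip(*members)] if members else old
            )
        if updated == centroids:  # converged
            break
        centroids = updated
    return labels, centroids

# Hypothetical samples: [forecast precipitation, past runoff].
samples = [[1.0, 10.0], [2.0, 12.0], [20.0, 50.0],
           [22.0, 55.0], [60.0, 120.0], [65.0, 130.0]]
labels, centroids = kmeans(samples, tercile_centroids(samples))
```

Samples sharing a label form one class of training subsamples; the returned centroids can be stored for later similarity checks against new data.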
Step S5: radial flow probability prediction coupled with a Bayesian-NHMM-artificial intelligence model. Firstly, based on a runoff forecasting factor classification grading result, respectively taking 3 types of subsamples as input of an artificial neural network ANN, a support vector machine SVM, a random forest model RF and a long-short-term memory model LSTM, respectively taking measured runoffs of each section as model simulation prediction targets, respectively training four models, selecting a Mean Square Error (MSE) as a loss function in the model training process, downscaling a physical factor into the section simulation runoffs, and obtaining each section simulation and forecasting runoff result of a research area based on different models; and then weighting and averaging the model forecasting results based on the root mean square error reciprocal to obtain the runoff forecasting results of each section of the final research area.
Specifically, step S5 includes:
s5.1: after preliminary screening is carried out on physical factors influencing runoff formation and final influence factors are obtained, the physical factors are downscaled into section simulated runoffs based on an artificial neural network ANN, a support vector machine SVM, a random forest model RF and a long-short-term memory model LSTM, and future runoff forecasting work is respectively carried out.
Wherein, the long-short-period memory model LSTM is added with a gate (forgetting gate f) for judging whether the information is useful or not on the basis of the cyclic neural network RNN t Input gate i t And an output gate o t ) Writing, resetting functions and reading of cells are achieved. Wherein:
forgetting the door: deciding whether the past information is saved or discarded, the decision being made by including the previous output h t-1 And current input X t Sigmoid layer decision of (2):
f t =σ(W f ·[h t-1 ,X t ]+b f ) (5)
wherein: σ is the sigmoid function, wf and bf are the weight and bias of the forgetting gate, respectively.
An input door: the input gate determines the information stored in the LSTM cell. The value i of the input gate t Based on a signal containing the previous output h t-1 And current input X t Sigmoid layer acquisition of (1):
in C' t Is a candidate vector of new cell state values, consisting of a vector containing h t-1 And X t Is determined by the tanh function of (2). f (f) t ,C t-1 ,i t And C' t Obtain an update of LSTM units:
C t =f t ·C t-1 +i t ·C′ t (7)
output door: the output gate decides which part of the obtained LSTM update unit is to be output. Firstly, an Sigmoid layer is operated to determine an output state, then the obtained unit state value is normalized to be between-1 and 1 through a tanh function, and finally, the unit state value is multiplied with the output of the Sigmoid layer:
the presence of these gates allows the LSTM cell to memorize information at different times.
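The gate operations described above can be traced in a single-unit sketch. It follows the standard LSTM formulation, which is assumed here to match the equations that are not reproduced in this text; the scalar weights, the parameter layout `(w_h, w_x, b)` and the function names are hypothetical simplifications of the matrix form.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell.

    params maps each of "f", "i", "c", "o" to a tuple (w_h, w_x, b).
    """
    def layer(name, activation):
        w_h, w_x, b = params[name]
        return activation(w_h * h_prev + w_x * x_t + b)

    f_t = layer("f", sigmoid)            # forget gate: how much of C_{t-1} to keep
    i_t = layer("i", sigmoid)            # input gate: how much new information to store
    c_cand = layer("c", math.tanh)       # candidate cell state C'_t
    c_t = f_t * c_prev + i_t * c_cand    # cell update, as in formula (7)
    o_t = layer("o", sigmoid)            # output gate
    h_t = o_t * math.tanh(c_t)           # output: gated, tanh-scaled cell state
    return h_t, c_t

# With all weights and biases at zero every gate equals 0.5 and C'_t equals 0.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("f", "i", "c", "o")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

In this degenerate case half of the previous cell state is retained, which makes the role of the forget gate easy to check by hand.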
The LSTM, ANN, SVM and RF are each trained on the 3 classes of subsamples obtained in step S4 to obtain the runoff forecast of each section.
S5.2: Based on the accuracy with which each machine learning model simulates runoff, the forecast runoff weight of each model is calculated from the reciprocal of its root mean square error, and the final forecast runoff of each section is obtained by weighted averaging:
wherein: m = 1, 2, 3, 4 indexes the four machine learning models; $w_m$ is the weight of the m-th machine learning model before normalization; d = 1, 2, …, N indexes the days, N being the total number of days of simulated runoff for each section; $simS_{d,m}$ is the runoff simulated by the m-th model on day d; $S_d$ is the observed runoff on day d; and $wf_m$ is the normalized final weight, which ensures that the four model weights sum to 1.
For a given class of subsamples, the four trained machine learning models are weighted by the calculated weights $wf_m$ to obtain the runoff forecasting model for that class, so that runoff forecasting models for the three classes of samples are obtained. The model takes into account the dynamic influence of climate change and underlying surface factors on the runoff generation and concentration mechanism of the research area, and can extend the forecast period and improve the accuracy of medium- and long-term runoff forecasts.
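The weighting in S5.2 can be sketched as follows, assuming the pre-weight of each model is the reciprocal of its root mean square error and the final weight is that value divided by the sum over all models. The function names and the two-model example are hypothetical, and the sketch assumes no model has an RMSE of exactly zero.

```python
import math

def rmse(simulated, observed):
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n)

def inverse_rmse_weights(sims_by_model, observed):
    """Normalized weights proportional to 1 / RMSE (assumes every RMSE > 0)."""
    pre = {m: 1.0 / rmse(sim, observed) for m, sim in sims_by_model.items()}
    total = sum(pre.values())
    return {m: w / total for m, w in pre.items()}

def combine(forecasts_by_model, weights):
    """Weighted average of the model forecasts, day by day."""
    days = len(next(iter(forecasts_by_model.values())))
    return [
        sum(weights[m] * forecasts_by_model[m][d] for m in forecasts_by_model)
        for d in range(days)
    ]

# Hypothetical simulations over the training period (only two models, for brevity).
observed = [1.0, 2.0, 3.0]
weights = inverse_rmse_weights(
    {"ANN": [2.0, 3.0, 4.0], "RF": [3.0, 4.0, 5.0]}, observed
)
final = combine({"ANN": [10.0], "RF": [13.0]}, weights)
```

Here the model with half the error receives twice the weight (2/3 against 1/3), so the combined forecast lies closer to its value.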
Step S6: mid-long term flood early warning based on probability forecast information. And acquiring relevant data (particularly, the hydrological meteorological factor data measured in real time) of the research area in real time, respectively calculating the similarity between the relevant data and the 3 types of training subsamples, selecting a runoff forecasting model corresponding to the training subsamples with the highest similarity, and inputting the relevant data into the runoff forecasting model to obtain the forecasting runoff. Further, in the period of the forecast period, the probability that the maximum flow (water level) corresponding to the forecast runoff exceeds a given value h can be determined through the forecast runoff and an overrun probability distribution function based on flood characteristic indexes; based on the information, the flood risk early warning grades are divided from low to high, flood risk early warning is carried out, and technical support is provided for a flood prevention decision department to start emergency response of corresponding grades. The surmounting probability distribution function based on the flood characteristic index can be obtained from the historical data of the research area; h is an overrun probability threshold value of risk class division, and can be obtained according to the inquiry of the early warning guide of the flood risk of the medium and small rivers.
It will be readily appreciated by those skilled in the art that the foregoing is merely a preferred embodiment of the invention and is not intended to limit the invention; any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (10)
1. A flood forecasting method coupling artificial intelligence and Bayesian theory is characterized by comprising the following steps:
S1, collecting multiple kinds of flood-related data of a research area as candidate precipitation influence factors;
S2, calculating the correlation coefficient between each candidate precipitation influence factor and the precipitation amount, screening out the precipitation influence factors whose correlation coefficient exceeds a preset threshold, and further processing the screened precipitation influence factors through principal component analysis to obtain final precipitation influence factors; constructing a Bayesian-NHMM model, and determining a plurality of candidate values of the number of precipitation states k in the Bayesian-NHMM model;
S3, inputting the final precipitation influence factors into the Bayesian-NHMM model for each candidate value of k, carrying out precipitation prediction, and taking the Bayesian-NHMM model with the minimum precipitation prediction error as the optimal precipitation prediction model;
S4, constructing training samples based on the series of precipitation values predicted by the optimal precipitation prediction model, and classifying the training samples to obtain N classes of training subsamples;
S5, training machine learning models on the N classes of training subsamples respectively to obtain N trained machine learning models as runoff forecasting models; and realizing flood forecasting based on the runoff forecasting models.
2. The flood forecast method of claim 1, wherein the flood related data comprises macro-scale climate factor data, hydro-meteorological factor data and underlying effect factor data, wherein the macro-scale climate factor data comprises an ENSO index, an atmospheric flow index and an indian ocean dipole, the hydro-meteorological factor data comprises study area runoff, precipitation, air temperature, wind speed, air humidity and net radiation, and the underlying effect factor data comprises study area leaf area index and sunlight induced chlorophyll fluorescence.
3. The flood forecast method of claim 2, wherein step S4 specifically comprises:
building a training sample: the training sample comprises a rainfall value predicted by the optimal rainfall prediction model, and hydrological factor data and underlying effect factor data;
and dividing the precipitation values predicted by the optimal precipitation prediction model into N classes by magnitude, and further dividing the other training samples into N classes through the K-means clustering method to obtain N classes of training subsamples.
4. The flood forecast method of claim 1, wherein in step S2, the method for determining the final precipitation impact factor is specifically:
(1) Calculating correlation coefficients between each alternative precipitation predictor and precipitation amount respectively;
(2) Selecting a precipitation predictor with a correlation coefficient exceeding a preset threshold value, and processing the screened precipitation predictor through principal component analysis;
(3) Performing k-fold cross validation on the processed rainfall forecast factors, and if the validation is not passed, adjusting a preset threshold value and returning to the step (2); and if the verification is passed, the precipitation predictor after the treatment is the final precipitation influence factor.
5. The flood forecast method of claim 1, wherein in step S2, when calculating the correlation coefficient, if the candidate precipitation predictor and the precipitation amount are in a linear correlation relationship, a Pearson correlation coefficient is adopted; and if the alternative precipitation prediction factor and the precipitation amount are in a nonlinear correlation relationship, adopting a mutual information index as a correlation coefficient.
6. The flood forecast method of claim 1, wherein in step S2, a number of alternative values of the number k of precipitation states in the Bayesian-NHMM model are determined, specifically:
presetting a plurality of k values, calculating BIC function values corresponding to the Bayesian-NHMM model when different k values, thereby establishing a correlation curve graph of the k values and the BIC function values, and taking the k values corresponding to inflection points in the correlation curve graph as alternative values.
7. The flood forecast method of claim 1, wherein in step S5, there are four machine learning models, specifically an artificial neural network, a support vector machine, a random forest model and a long short-term memory model;
respectively inputting any type of training sub-sample into four machine learning models for training to obtain four trained machine learning models as runoff forecasting sub-models; and further determining the weight of each runoff forecasting sub-model according to the simulation precision of each runoff forecasting sub-model on the runoffs, thereby obtaining the runoff forecasting model corresponding to the training sub-sample.
8. The flood forecast method coupling artificial intelligence and Bayesian theory according to any of claims 1-7, wherein implementing flood forecast based on a runoff forecast model comprises:
and acquiring relevant data of the research area in real time, respectively calculating the similarity of the relevant data and N types of training subsamples, selecting a runoff forecasting model corresponding to the training subsamples with the highest similarity, and inputting the relevant data into the runoff forecasting model to obtain forecasting runoffs, thereby realizing flood forecasting.
9. A system for implementing the flood forecasting method coupling artificial intelligence and Bayesian theory according to any one of claims 1-8, comprising:
the database construction module is used for collecting various flood related data of the research area and taking the flood related data as an alternative precipitation influence factor;
the first parameter estimation module is used for respectively calculating correlation coefficients between each candidate precipitation influence factor and precipitation amount, screening out precipitation influence factors with the correlation coefficients exceeding a preset threshold value, and further processing the screened precipitation influence factors through principal component analysis to obtain final precipitation influence factors; constructing a Bayesian-NHMM model, and determining a plurality of alternative values of the precipitation state number k in the Bayesian-NHMM model;
the precipitation prediction model construction module is used for inputting the final precipitation influence factors into the Bayesian-NHMM model when k takes different alternative values, carrying out precipitation prediction, and taking the Bayesian-NHMM model with the minimum precipitation prediction error as an optimal precipitation prediction model;
the second parameter estimation module is used for constructing a training sample based on a series of rainfall values predicted by the optimal rainfall prediction model, classifying the training sample, and obtaining N types of training subsamples;
the combined forecasting module is used for training the machine learning models through the N training subsamples respectively to obtain N trained machine learning models as runoff forecasting models, and flood forecasting is achieved based on the runoff forecasting models.
10. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the flood forecasting method coupling artificial intelligence and Bayesian theory according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310592943.7A CN116663404A (en) | 2023-05-24 | 2023-05-24 | Flood forecasting method and system coupling artificial intelligence and Bayesian theory |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116663404A (en) | 2023-08-29 |
Family
ID=87709022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310592943.7A Pending CN116663404A (en) | 2023-05-24 | 2023-05-24 | Flood forecasting method and system coupling artificial intelligence and Bayesian theory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116663404A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117689042A (en) * | 2024-02-01 | 2024-03-12 | 河海大学 | Drainage basin runoff forecasting method based on interpretative machine learning |
CN117689042B (en) * | 2024-02-01 | 2024-05-17 | 河海大学 | Drainage basin runoff forecasting method based on interpretative machine learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114626512B (en) | High-temperature disaster forecasting method based on directed graph neural network | |
CN113537600B (en) | Medium-long-term precipitation prediction modeling method for whole-process coupling machine learning | |
CN111665575B (en) | Medium-and-long-term rainfall grading coupling forecasting method and system based on statistical power | |
CN112288164B (en) | Wind power combined prediction method considering spatial correlation and correcting numerical weather forecast | |
CN110648014B (en) | Regional wind power prediction method and system based on space-time quantile regression | |
CN113705877B (en) | Real-time moon runoff forecasting method based on deep learning model | |
CN113468803B (en) | WOA-GRU flood flow prediction method and system based on improvement | |
CN113554466B (en) | Short-term electricity consumption prediction model construction method, prediction method and device | |
CN111767517B (en) | BiGRU multi-step prediction method, system and storage medium applied to flood prediction | |
CN114781538B (en) | Air quality prediction method and system for GA-BP neural network coupling decision tree | |
CN109143408B (en) | Dynamic region combined short-time rainfall forecasting method based on MLP | |
CN113269365B (en) | Short-term air conditioner load prediction method and system based on sparrow optimization algorithm | |
Seifi et al. | Multi-model ensemble prediction of pan evaporation based on the Copula Bayesian Model Averaging approach | |
CN109886496B (en) | Agricultural yield prediction method based on meteorological information | |
CN113537469B (en) | Urban water demand prediction method based on LSTM network and Attention mechanism | |
CN113393057A (en) | Wheat yield integrated prediction method based on deep fusion machine learning model | |
CN116826737A (en) | Photovoltaic power prediction method, device, storage medium and equipment | |
CN114372631A (en) | Data-lacking area runoff prediction method based on small sample learning and LSTM | |
CN116187835A (en) | Data-driven-based method and system for estimating theoretical line loss interval of transformer area | |
CN116663404A (en) | Flood forecasting method and system coupling artificial intelligence and Bayesian theory | |
CN116805439A (en) | Drought prediction method and system based on artificial intelligence and atmospheric circulation mechanism | |
CN117993305B (en) | Dynamic evaluation method for river basin land utilization and soil erosion relation | |
CN117313795A (en) | Intelligent building energy consumption prediction method based on improved DBO-LSTM | |
CN115096357A (en) | Indoor environment quality prediction method based on CEEMDAN-PCA-LSTM | |
CN117290673A (en) | Ship energy consumption high-precision prediction system based on multi-model fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||