CN112561135A - Water flow prediction method and device based on machine learning and electronic equipment - Google Patents

Info

Publication number
CN112561135A
Authority
CN
China
Prior art keywords
data
flow
rainfall
future
iteration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011375758.5A
Other languages
Chinese (zh)
Inventor
王�琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Cresun Innovation Technology Co Ltd
Original Assignee
Xian Cresun Innovation Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Cresun Innovation Technology Co Ltd
Priority to CN202011375758.5A
Publication of CN112561135A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Abstract

The invention discloses a water flow prediction method and device based on machine learning, and an electronic device. The method comprises the following steps: acquiring original rainfall data and original flow data of a preset watershed; preprocessing them to obtain processed rainfall data and processed flow data, respectively; constructing input data based on the processed rainfall data, the processed flow data, and the known rainfall data of each hydrological site for the next P hours; and inputting the input data into a pre-trained flood flow prediction model for cyclic iterative prediction. In each iteration, time sequence features are extracted from that iteration's input data and classification prediction is performed to obtain a flow prediction value of a target hydrological site for one future hour; the obtained flow prediction value is then used to reconstruct the input data for the iteration covering the next future hour. After P iterations, flow prediction values of the target hydrological site for the next P hours are obtained. The method can accurately predict the flood flow over a future time period.

Description

Water flow prediction method and device based on machine learning and electronic equipment
Technical Field
The invention belongs to the field of flood prediction, and particularly relates to a water flow prediction method and device based on machine learning and an electronic device.
Background
Floods are among the most common natural disasters. Every year they affect hundreds of millions of people, many of whom are displaced, and cause enormous financial and material losses. Effectively predicting flood flow and issuing timely early warnings is therefore of great significance for flood control and disaster reduction.
Current flood flow prediction models fall mainly into two categories: traditional physical models and intelligent flood prediction models. A traditional physical model, such as the Xinanjiang model, is a region-specific prediction model obtained by calculating the parameters of the physical process after thoroughly studying local physical characteristics such as landform, evaporation capacity and vegetation coverage. An intelligent flood prediction model uses massive historical data as prior knowledge and applies intelligent methods such as machine learning to obtain a function mapping, or a joint distribution, from input features to output features.
However, most existing flood flow prediction models perform single-point prediction, that is, they predict the flow at one future time point. In practice, predicted flow data for a single time point has limited application value.
Disclosure of Invention
The embodiment of the invention aims to provide a water flow prediction method and device based on machine learning and electronic equipment, so as to realize the purpose of predicting the flood flow in a future time period. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a water flow prediction method based on machine learning, where the method includes:
acquiring original rainfall data and original flow data of a preset watershed; the original rainfall data comprises hourly rainfall data of N hydrological stations of the preset watershed over K historical years, and the original flow data comprises hourly flow data, over the K historical years, of a target hydrological station that is one of the N hydrological stations and is located at an outlet section of the preset watershed;
preprocessing the original rainfall data and the original flow data to respectively obtain processed rainfall data and processed flow data;
constructing input data based on the processed rainfall data, the processed flow data and the pre-acquired known rainfall data of each hydrological site in the future P hours;
inputting the input data into a pre-trained flood flow prediction model for cyclic iterative prediction, wherein in each iteration time sequence features are extracted from the input data of that iteration and classification prediction is performed to obtain a flow prediction value of the target hydrological station for one future hour, and the obtained flow prediction value is used to reconstruct the input data for the iteration of the next future hour, until P iterations are completed and flow prediction values of the target hydrological station for the future P hours are obtained; wherein N, K and P are natural numbers greater than 1.
Optionally, the preprocessing the original rainfall data and the original flow data to obtain processed rainfall data and processed flow data respectively includes:
performing data elimination and data completion processing on the original rainfall data to obtain completed rainfall data;
performing normalization processing on the supplemented rainfall data to obtain processed rainfall data;
and carrying out normalization processing on the original flow data to obtain processed flow data.
Optionally, the performing data elimination and data completion processing on the original rainfall data to obtain completed rainfall data includes:
removing, from the original rainfall data, the data corresponding to hydrological sites whose number of data entries is lower than a preset number, to obtain residual rainfall data;
and completing the missing rainfall data in the residual rainfall data by using an inverse distance weighting method to obtain the completed rainfall data.
Optionally, the normalization process includes a [0,1] normalization process.
Optionally, the constructing input data based on the processed rainfall data, the processed flow data, and the pre-obtained known rainfall data of each hydrological site in the future P hours includes:
determining first normalized rainfall sum data of the N hydrological stations corresponding to each hour in T hours based on the processed rainfall data; determining second normalized rainfall sum data of the N hydrological stations corresponding to each hour in the future P hours based on the known rainfall data of the hydrological stations in the future P hours, and forming rainfall data components by the first normalized rainfall sum data and the second normalized rainfall sum data;
selecting data corresponding to T hours from the processed flow data, and combining it with the to-be-predicted flow data of the target hydrological site for the future P hours, which is filled with 0, to form a flow data component;
and forming input data with a dimension of [T + P, 2] from the rainfall data component and the flow data component, wherein T is a natural number greater than 1, and T is less than or equal to the number of hours corresponding to the K years.
Optionally, the inputting the input data into a flood flow prediction model obtained by pre-training for performing cyclic iterative prediction, extracting timing characteristics from the input data of the iteration at each iteration, performing classification prediction to obtain a flow prediction value of the target hydrologic site for one hour in the future, and reconstructing the input data for the iteration of the next hour in the future by using the obtained flow prediction value of the hour in the future, includes:
for the i-th iteration, extracting time sequence features of the rainfall data and flow data corresponding to the latest T hours from the input data of that iteration by using the three one-dimensional convolution layers of the flood flow prediction model, and performing classification prediction by using the three fully connected layers of the flood flow prediction model, to obtain a flow prediction value of the target hydrological site for the i-th future hour;
and storing the flow prediction value of the i-th future hour in a result list, and, when i is smaller than P, replacing the 0 at the corresponding flow data position in the input data of that iteration with the flow prediction value to obtain the input data of the (i+1)-th iteration, wherein i = 1, 2, …, P.
Optionally, the numbers of convolution kernels of the three one-dimensional convolution layers are 50, 30 and 10, respectively; the convolution kernels of the three one-dimensional convolution layers are all of size 5 × 2; the numbers of neurons of the three fully connected layers are 10, 10 and 1, respectively.
In a second aspect, an embodiment of the present invention provides a water flow predicting apparatus based on machine learning, where the apparatus includes:
an original data acquisition module, configured to acquire original rainfall data and original flow data of a preset watershed; the original rainfall data comprises hourly rainfall data of N hydrological stations of the preset watershed over K historical years, and the original flow data comprises hourly flow data, over the K historical years, of a target hydrological station that is one of the N hydrological stations and is located at an outlet section of the preset watershed;
the data preprocessing module is used for preprocessing the original rainfall data and the original flow data to respectively obtain processed rainfall data and processed flow data;
the input data construction module is used for constructing input data based on the processed rainfall data, the processed flow data and the pre-acquired known rainfall data of each hydrological site in the future P hours;
the input data prediction module is used for inputting the input data into a flood flow prediction model obtained by pre-training for cyclic iteration prediction, extracting time sequence characteristics from the input data of the iteration and performing classification prediction during each iteration to obtain a flow prediction value of the target hydrological station in one hour in the future, and reconstructing the input data of the iteration of the next hour in the future by using the obtained flow prediction value of the next hour until P iterations are completed to obtain a flow prediction value of the target hydrological station in P hours in the future; wherein N, K and P are natural numbers greater than 1.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor and a memory, wherein,
the memory is used for storing a computer program;
the processor is configured to implement the steps of the water flow prediction method based on machine learning according to the embodiment of the present invention when executing the program stored in the memory.
In a fourth aspect, the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the water flow prediction method based on machine learning provided by the embodiment of the present invention.
In the scheme provided by the embodiment of the invention, a flood flow prediction model is established based on a cyclic recursion methodology. Several years of historical rainfall data and flow data, together with the known rainfall data for the future P hours, serve as prior data from which the input data of the pre-trained flood flow prediction model is constructed. Over multiple cyclic iterations, the flow prediction value for one future hour obtained in each prediction is used to construct the input data for the iteration of the next future hour. After P iterations, the model can output the flow prediction values for the future P hours in a single run, thereby achieving the purpose of predicting the flood flow over a future time period.
Drawings
FIG. 1 is a schematic flow chart of a water flow prediction method based on machine learning according to an embodiment of the present invention;
FIG. 2 is an exemplary diagram of input data according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a flood flow prediction model according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a prediction process according to an embodiment of the present invention;
FIG. 5 is a comparison graph of flood peak prediction effects according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a water flow predicting apparatus based on machine learning according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to achieve the purpose of flood flow prediction in a future time period, embodiments of the present invention provide a water flow prediction method and apparatus based on machine learning, an electronic device, and a storage medium.
It should be noted that the execution subject of the water flow prediction method based on machine learning according to the embodiment of the present invention may be a water flow prediction apparatus based on machine learning, which may be run in an electronic device. The electronic device may be a server or a terminal device, but is not limited thereto.
First, a water flow prediction method based on machine learning according to an embodiment of the present invention will be described.
In a first aspect, as shown in fig. 1, a method for predicting water flow based on machine learning according to an embodiment of the present invention may include the following steps:
S1, acquiring original rainfall data and original flow data of a preset watershed;
The original rainfall data comprises hourly rainfall data of N hydrological stations of the preset watershed over K historical years, and the original flow data comprises hourly flow data, over the K historical years, of the target hydrological station that is one of the N hydrological stations and is located at the outlet section of the preset watershed;
The preset watershed is a geographical area containing N hydrological stations that monitor rainfall conditions. One of the N hydrological stations, the target hydrological station, is located at the outlet section of the preset watershed and monitors changes in the water level and flow at that section. Therefore, in the embodiment of the invention, the hourly historical rainfall data over K years can be acquired from the N hydrological stations to generate the original rainfall data, and the hourly historical flow data over K years can be acquired from the target hydrological station to generate the original flow data. In a preferred embodiment, the K historical years are the K consecutive years in the historical data that are closest to the current time. N and K are natural numbers greater than 1.
In an optional embodiment, the raw rainfall data may further include total rainfall data of all hydrological sites per hour.
As a specific example of the embodiment of the present invention, the basin of a county in Henan Province, China is selected as the preset watershed. The basin has 50 hydrological sites in total, and the time range may be 2013 to 2018, that is, N = 50 and K = 6. Examples of the obtained original rainfall data and original flow data are shown in Tables 1 and 2; the tables illustrate the format only, and specific numerical values are not shown. TM is a timestamp with a resolution of one hour; S1 to S50 denote the rainfall data of the 50 hydrological sites; SUM denotes the rainfall sum over the 50 hydrological sites; and Q denotes the flow data of the target hydrological site. S1 to S50, SUM and Q are all in millimeters.
Table 1 Example of original rainfall data
TM                 S1   S2   ……   S50   SUM
2013-01-01 00:00   /    /    /    /     /
2013-01-01 01:00   /    /    /    /     /
……                 /    /    /    /     /
2013-12-31 23:00   /    /    /    /     /
……                 /    /    /    /     /
2018-12-31 23:00   /    /    /    /     /
Table 2 Example of original flow data
TM                 Q
2013-01-01 00:00   /
2013-01-01 01:00   /
……                 /
2013-12-31 23:00   /
……                 /
2018-12-31 23:00   /
As will be appreciated by those skilled in the art, the overall dimension of the original rainfall data illustrated in Table 1 is [(6 × 365 × 24 + 1), 52] = [52561, 52]. Likewise, the overall dimension of the original flow data illustrated in Table 2 is [52561, 2].
S2, preprocessing the original rainfall data and the original flow data to respectively obtain processed rainfall data and processed flow data;
In an alternative embodiment, step S2 may include steps S21 to S23:
S21, performing data elimination and data completion processing on the original rainfall data to obtain completed rainfall data;
In practice, rainfall data of a hydrological site may be missing because of omissions in recording and similar causes. If original rainfall data with too many missing values is used in the calculation, the accuracy of subsequent prediction suffers. Therefore, in this step, the data of hydrological sites with too many missing values is removed first.
S21 specifically includes the following two steps:
Step one, removing, from the original rainfall data, the data corresponding to hydrological sites whose number of data entries is lower than a preset number, to obtain residual rainfall data;
For example, for the original rainfall data of Table 1, if the number of rainfall data entries of a hydrological site is lower than the preset number 30000, the proportion of existing data is smaller than
$$\frac{30000}{6 \times 365 \times 24} = \frac{30000}{52560} \approx 57\%$$
This means that almost half of the data is missing, and for such a large amount of missing data it is not advisable to complete the missing values in subsequent processing. Completing them would introduce too many artificial factors and reduce the generalization capability of the model, so that the rules finally learned by the neural network would not conform to the real data relationships. Therefore, the data corresponding to hydrological sites with many missing values can be removed and the rainfall data of the better-covered hydrological sites retained, giving the residual rainfall data.
The preset quantity can be reasonably selected according to the dimension of the original rainfall data and the requirement of data precision.
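The elimination in step one can be sketched as follows. This is a minimal illustration only, assuming a hypothetical in-memory layout in which each station maps to a list of hourly readings and `None` marks a missing hour; the function name `drop_sparse_stations` and the toy threshold are not from the source.

```python
from typing import Dict, List, Optional

# Hypothetical layout: station id -> hourly readings, None marks a missing hour.
RainTable = Dict[str, List[Optional[float]]]


def drop_sparse_stations(rain: RainTable, min_count: int) -> RainTable:
    """Keep only stations with at least `min_count` non-missing readings."""
    kept: RainTable = {}
    for station, series in rain.items():
        valid = sum(1 for v in series if v is not None)
        if valid >= min_count:
            kept[station] = series
    return kept


# Toy example: 6 hours of data and a threshold of 4 valid readings.
raw = {
    "S1": [0.0, 1.2, None, 0.4, 0.0, 2.1],     # 5 valid readings, kept
    "S2": [None, None, 0.3, None, None, 0.0],  # 2 valid readings, dropped
}
remaining = drop_sparse_stations(raw, min_count=4)
print(sorted(remaining))
```

With the figures used in the example above, `min_count` would be 30000 against 52560 hourly rows per station.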
Step two, completing the missing rainfall data in the residual rainfall data by using an inverse distance weighting method to obtain completed rainfall data.
It can be understood by those skilled in the art that there may still be some missing rainfall data at some time points in the remaining rainfall data, and the missing rainfall data can be complemented by using a missing value complementing algorithm.
Among them, the inverse distance weighting (IDW) algorithm is a classic algorithm for missing value completion. Following the principle that the farther apart two sites are, the smaller their correlation, the algorithm completes the missing rainfall data of a hydrological site using the rainfall data of adjacent hydrological sites. The formula of the inverse distance weighting algorithm is shown in formula (1) below.
$$x = \frac{\sum_{j=1}^{m} x_j / d_j^{\,p}}{\sum_{j=1}^{m} 1 / d_j^{\,p}} \tag{1}$$
Wherein $x$ is the estimated rainfall value of the hydrological site with missing data; $m$ is the number of hydrological sites participating in the calculation; $x_j$ is the actual rainfall value of the $j$-th adjacent hydrological site; $d_j$ is the actual distance between the $j$-th adjacent hydrological site participating in the calculation and the hydrological site with missing data; $p$ is the exponent applied to the distance, and can generally take a value of 2.
Using the inverse distance weighting algorithm, the rainfall data of hydrological sites with missing values in the residual rainfall data can be completed, and complete rainfall sum data can then be obtained, yielding the completed rainfall data; that is, the data in Table 1 is completely filled in.
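As a hedged illustration of the completion step, the sketch below estimates one missing reading from neighbouring stations. It assumes the usual IDW form in which each neighbour is weighted by the inverse of its distance raised to a power (default 2), planar station coordinates in arbitrary units, and `None` as the missing-value marker; the function name and the coordinates are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def idw_estimate(
    target_xy: Point,
    neighbours: Sequence[Tuple[Point, Optional[float]]],
    power: float = 2.0,
) -> float:
    """Inverse-distance-weighted estimate from neighbours with valid readings."""
    num = 0.0
    den = 0.0
    for (x, y), value in neighbours:
        if value is None:
            continue  # a neighbour that is itself missing cannot contribute
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value  # co-located station: use its reading directly
        w = 1.0 / d ** power
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("no neighbouring station has a valid reading")
    return num / den


# Station at the origin is missing one hourly reading.
estimate = idw_estimate(
    (0.0, 0.0),
    [((1.0, 0.0), 2.0), ((2.0, 0.0), 6.0), ((0.0, 3.0), None)],
)
print(round(estimate, 6))  # weights 1 and 0.25 give (2.0 + 1.5) / 1.25 = 2.8
```

The nearer station dominates the estimate, which matches the stated principle that correlation falls with distance.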
S22, performing normalization processing on the supplemented rainfall data to obtain processed rainfall data;
In this step, the rainfall data of all hydrological sites in the completed rainfall data is normalized, and the rainfall sum data is normalized separately.
And S23, carrying out normalization processing on the original flow data to obtain processed flow data.
Since the normalization processing in steps S22 and S23 is the same, the two are described together. Normalization is a dimensionless processing technique that converts the absolute values of a physical quantity into relative values. It is an effective way to simplify calculation and reduce magnitudes. Proper normalization can speed up gradient descent in finding the optimal solution, reduce the magnitude of the model parameter values, accelerate convergence during model training, and improve the performance and precision of the model.
Common normalization methods include: min-max normalization, standard deviation normalization, non-linear normalization, etc. Since the rainfall data and the flow data do not satisfy the normal distribution, the normalization process includes a [0,1] normalization process in a preferred embodiment of the present invention.
The [0,1] normalization process is a form of min-max normalization, which rescales the values of each dimension of the data so that the final data vector falls within [0,1]. The specific formula of the [0,1] normalization process is shown in formula (2) below.
$$y^{*} = \frac{y - y_{\min}}{y_{\max} - y_{\min}} \tag{2}$$
Wherein $y_{\max}$ and $y_{\min}$ are the maximum and minimum values in the data samples, respectively; $y$ is the raw data before normalization, and $y^{*}$ is the normalized data.
Through the [0,1] normalization processing, the processed rainfall data (the normalized completed rainfall data) and the processed flow data (the normalized original flow data) can be obtained.
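The [0,1] normalization can be illustrated with a short sketch. The function name is hypothetical, and mapping a constant series to all zeros is an assumption made here to avoid division by zero; the source does not address that case.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1] using the sample minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no range, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


flow = [10.0, 20.0, 15.0, 30.0]
normalized_flow = min_max_normalize(flow)
print(normalized_flow)  # [0.0, 0.5, 0.25, 1.0]
```

Each station column, the rainfall sum column and the flow column would be passed through such a function separately, since the text normalizes them independently.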
S3, constructing input data based on the processed rainfall data, the processed flow data and the pre-acquired known rainfall data of each hydrological site in the future P hours;
In an alternative embodiment, step S3 may include the following steps S31 to S33:
S31, determining first normalized rainfall sum data of the N hydrological stations corresponding to each hour in T hours based on the processed rainfall data; determining second normalized rainfall sum data of the N hydrological stations corresponding to each hour in the future P hours based on the known rainfall data of the hydrological stations in the future P hours; and forming a rainfall data component from the first normalized rainfall sum data and the second normalized rainfall sum data; wherein T is a natural number greater than 1, and T is less than or equal to the number of hours corresponding to K years, that is, T ≤ K × 365 × 24; for example, T may be 100 hours. In a preferred embodiment, the T hours are the consecutive T hours closest to the current time.
It is understood that if the raw rainfall data is shown in table 1, and includes SUM column, then the normalized rainfall SUM data for each of the N hydrologic sites over the past T hours can be obtained from the processed rainfall data through the processing of the previous step, which is named as the first normalized rainfall SUM data for the sake of distinction in the embodiments of the present invention.
If the original rainfall data does not have the SUM column shown in table 1 and only contains rainfall data of each hydrologic site, the processed rainfall data obtained through the previous step is rainfall data of each hydrologic site after completion of supplementation and normalization, and at this time, the step S31 may also calculate normalized rainfall SUM data of each hour of the N hydrologic sites in the past T hours, that is, first normalized rainfall SUM data, by using the processed rainfall data.
P denotes the length of the future time period to be predicted and is a natural number greater than 1; for example, P may be 24 hours. In the embodiment of the invention, the rainfall data of each hydrological site for the next 24 hours can be obtained by means such as weather forecasts. For each of those 24 hours, the rainfall data of all hydrological sites is summed to give the rainfall sum data of that hour, and the 24 rainfall sum values are then normalized to obtain the normalized rainfall sum data of all hydrological sites for each of the next 24 hours, which the embodiment of the invention names the second normalized rainfall sum data. This data is treated as known data, that is, prior data.
Then, the first normalized rainfall sum data and the second normalized rainfall sum data are arranged according to the time sequence to form a rainfall data component with the dimensionality of [ T + P,1 ].
S32, selecting data corresponding to T hours from the processed flow data, and forming a flow data component with the flow data to be predicted of the target hydrological station filled with 0 for the next P hours;
Since the flow data of the target hydrological site for the future P hours is the data to be predicted, it can first be filled with 0 so that its dimension stays consistent with that of the rainfall data component. The data corresponding to T hours is selected from the processed flow data and arranged in time order together with the 0-filled to-be-predicted flow data of the target hydrological site for the future P hours, forming a flow data component with a dimension of [T + P, 1];
S33, forming input data with a dimension of [T + P, 2] from the rainfall data component and the flow data component.
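Steps S31 to S33 can be sketched as below, assuming the two normalized rainfall sum series and the normalized flow series are already available as plain lists. The function name `build_input` and the toy values are hypothetical.

```python
from typing import List, Sequence


def build_input(
    past_rain_sum: Sequence[float],    # first normalized rainfall sum data, length T
    future_rain_sum: Sequence[float],  # second normalized rainfall sum data, length P
    past_flow: Sequence[float],        # normalized flow of the target site, length T
) -> List[List[float]]:
    """Return T + P rows of [rainfall sum, flow]; future flow is filled with 0."""
    if len(past_rain_sum) != len(past_flow):
        raise ValueError("past rainfall and past flow must both cover T hours")
    rain = list(past_rain_sum) + list(future_rain_sum)     # [T + P] rainfall component
    flow = list(past_flow) + [0.0] * len(future_rain_sum)  # [T + P] flow component
    return [[r, q] for r, q in zip(rain, flow)]


# Toy example with T = 4 past hours and P = 2 future hours.
input_rows = build_input(
    past_rain_sum=[0.1, 0.3, 0.5, 0.2],
    future_rain_sum=[0.4, 0.0],
    past_flow=[0.2, 0.25, 0.4, 0.6],
)
print(len(input_rows), len(input_rows[0]))  # 6 2
```

The last P rows carry known rainfall but a flow of 0, which is the placeholder that the iterative prediction later overwrites.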
To facilitate understanding of the form of the input data, refer to FIG. 2, which is an exemplary diagram of the input data according to the embodiment of the present invention. The figure illustrates the form only, and specific data is not shown; 0 denotes the current time. The resulting input data is an array with a dimension of [T + P, 2]. The text labels in FIG. 2 are simplified for illustration: in the lower row, the first normalized rainfall sum data is shown on the left and the second normalized rainfall sum data on the right.
In the embodiment of the present invention, the sequence of S31 and S32 is not limited.
S4, inputting the input data into a pre-trained flood flow prediction model for cyclic iterative prediction, wherein in each iteration time sequence features are extracted from the input data of that iteration and classification prediction is performed to obtain a flow prediction value of the target hydrological station for one future hour, and the obtained flow prediction value is used to reconstruct the input data for the iteration of the next future hour, until P iterations are completed and flow prediction values of the target hydrological station for the future P hours are obtained; wherein N, K and P are natural numbers greater than 1.
To facilitate understanding of the prediction process of the embodiment of the present invention, first, a structure and a training process of the flood flow prediction model of the embodiment of the present invention are described.
In the embodiment of the invention, in order to realize flood flow prediction in a future time period, a specific flood flow prediction model is provided by utilizing one-dimensional convolution on the basis of a cyclic recursion methodology.
The recursive methodology is shown in the following equation (3).
Figure BDA0002808150280000131
For the embodiment of the invention, [q_H, ..., q_0] in formula (3) can represent the existing flow data. The invention combines the existing flow data with rainfall data to predict the flow data q_1 of the next hour, then takes the predicted q_1 as a known condition and uses q_(H+1), ..., q_1 to predict the flow data q_2 of the following hour, and so on. P iterations are performed until the flow prediction results for the next P hours are obtained.
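Formula (3) is available only as an image in the source. Based on the description above, the recursion can plausibly be written as follows, where f stands for the prediction model, H is the index of the oldest flow record in the window, and the rainfall inputs are omitted for brevity. This is a reconstruction under those assumptions, not the formula as published.

```latex
\begin{aligned}
q_1 &= f(q_H, \ldots, q_0), \\
q_2 &= f(q_{H+1}, \ldots, q_1), \\
&\;\;\vdots \\
q_P &= f(q_{H+P-1}, \ldots, q_{P-1}).
\end{aligned}
```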
One-dimensional convolution refers to sliding a convolution kernel along one dimension of given input data and computing at each position, so that the output of one convolution kernel is a one-dimensional vector. One-dimensional convolution is mainly applied to the analysis of time series. In a flood prediction task, the rainfall and flow used as input data have a very obvious time-sequence relation, so it is appropriate to use one-dimensional convolution to extract their features. Compared with the LSTM structured network, which performs well in the field of time series analysis, one-dimensional convolution has the advantages of a simpler structure and faster operation.
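The sliding computation described above can be illustrated with a short sketch in plain Python. The function `conv1d` and the toy values are assumptions made for illustration; as in most deep learning libraries, the kernel is applied without flipping, with no padding and a stride of 1.

```python
def conv1d(rows, kernel, bias=0.0):
    """Slide one kernel along the time axis of a multi-channel series.

    rows   -- list of time steps, each a list of channel values (e.g. [rain, flow])
    kernel -- list of k weight rows, each with one weight per channel
    Returns a one-dimensional list of length len(rows) - k + 1.
    """
    k = len(kernel)
    out = []
    for start in range(len(rows) - k + 1):
        total = bias
        for step in range(k):
            for channel, weight in enumerate(kernel[step]):
                total += weight * rows[start + step][channel]
        out.append(total)
    return out


# Four hours of [rainfall, flow]; the kernel spans 2 hours and both channels
series = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # rainfall at hour t plus flow at hour t + 1
features = conv1d(series, kernel)
print(features)  # [5.0, 9.0, 13.0]
```

A convolutional layer applies many such kernels in parallel, each producing one output channel.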
The embodiment of the invention builds a flood flow prediction model based on python, and the flood flow prediction model comprises three one-dimensional convolution layers and three full-connection layers, the specific structure is shown in fig. 3, and fig. 3 is a schematic structural diagram of the flood flow prediction model of the embodiment of the invention. Where Conv1D denotes a one-dimensional convolutional layer and Dense denotes a fully-connected layer. The parameters for the specific layers are shown in Table 3.
TABLE 3 layer parameters of flood flow prediction model
Figure BDA0002808150280000141
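Because Table 3 is available only as an image, the following sketch traces how the data shape could evolve through three one-dimensional convolution layers and three fully connected layers. The kernel counts (50, 30, 10), the kernel length of 5, and the neuron counts (10, 10, 1) are taken from the claims; the stride of 1, the absence of padding, and the flattening step before the first fully connected layer are assumptions, so the shapes actually labeled in fig. 4 may differ.

```python
def conv_output_length(length, kernel_size, stride=1):
    """Output length of a 1-D convolution without padding."""
    return (length - kernel_size) // stride + 1


def trace_shapes(window, channels, conv_filters, kernel_size, dense_units):
    """Return (layer name, output shape) pairs for a Conv1D stack followed by Dense layers."""
    shapes = [("input", (window, channels))]
    length = window
    for index, filters in enumerate(conv_filters, start=1):
        length = conv_output_length(length, kernel_size)
        channels = filters
        shapes.append((f"Conv1D_{index}", (length, channels)))
    shapes.append(("flatten", (length * channels,)))
    for index, units in enumerate(dense_units, start=1):
        shapes.append((f"Dense_{index}", (units,)))
    return shapes


# Kernel and neuron counts as stated in the claims; padding and stride are assumed
shapes = trace_shapes(100, 2, [50, 30, 10], 5, [10, 10, 1])
for name, shape in shapes:
    print(name, shape)
```

Under these assumptions a 100-hour window shrinks by 4 time steps per convolution layer, and the final fully connected layer yields a single flow value.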
The flood flow prediction model is obtained by performing loop iterative training on rainfall data and flow data in the previous T hours and rainfall data and flow data in the next P hours in historical data of N hydrologic sites.
For example, the sample input data may be constructed from 100-hour windows of rainfall data and flow data taken from the historical data of 2013 to 2017, together with the known rainfall data for the 24-hour time period to be predicted (as described above). The known flow data for the time period to be predicted is used as the true value, and cyclic iterative training is performed, for example for 100 iterations, until the trained model parameters are obtained, yielding the trained flood flow prediction model. The model training process is outlined as follows.
1) Each sample data and the corresponding true value in the sample input data are trained through the initial network model shown in the structure of fig. 3, and the training result of each sample data is obtained.
2) And comparing the training result of each sample data with the true value corresponding to the sample data to obtain the prediction result corresponding to the sample data.
3) And calculating the loss value of the network model according to the prediction result corresponding to each sample data.
4) Adjusting parameters of the network model according to the loss value, and repeating steps 1) to 3) until the loss value of the network model meets a convergence condition, for example until the loss value reaches a minimum and no longer decreases, which indicates that the training result of each sample data closely matches the true value corresponding to that sample data; the training of the network model is then complete and the trained flood flow prediction model is obtained.
Specifically, the training process uses the Adam (adaptive moment estimation) gradient-descent optimization algorithm to obtain a better loss descent process. The loss function is MSE (mean-square error), and the learning rate is 0.01. A Dropout operation is also added. Dropout is a regularization technique used in deep learning and can serve as a trick for training deep neural networks: by ignoring half of the feature detectors in each training batch (setting half of the hidden-layer node values to 0), overfitting can be significantly reduced.
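The training procedure of steps 1) to 4) with the Adam optimizer and the MSE loss can be sketched as follows. For brevity a one-weight linear model stands in for the network of fig. 3 and Dropout is omitted. The learning rate of 0.01 comes from the description, while the function names, the step count, and the values of `beta1`, `beta2`, and `eps` (commonly used Adam defaults) are assumptions.

```python
import math


def mse(predictions, targets):
    """Mean-square error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def train_linear_adam(xs, ys, lr=0.01, steps=2000, beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit y = w * x with Adam on the MSE loss; returns (w, loss history)."""
    w, m, v = 0.0, 0.0, 0.0
    history = []
    for t in range(1, steps + 1):
        preds = [w * x for x in xs]                 # 1) forward pass
        history.append(mse(preds, ys))              # 2)-3) compare with truth, compute loss
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
        m_hat = m / (1 - beta1 ** t)                # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)  # 4) parameter update
    return w, history


w, history = train_linear_adam([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(w, 2), history[0] > history[-1])
```

The toy data follow y = 2x, so the fitted weight should approach 2 while the recorded loss falls.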
After the training is finished, input data to be predicted can be input into a flood flow prediction model obtained through pre-training for cyclic iterative prediction. Specifically, the method comprises the following steps:
aiming at the ith iteration, extracting time sequence characteristics of rainfall data and flow data corresponding to the latest T hours from input data of the iteration by utilizing three layers of one-dimensional convolution layers of a flood flow prediction model, and performing classification prediction by utilizing three layers of full connection layers of the flood flow prediction model to obtain a flow prediction value of the target hydrological station in the ith hour in the future;
and storing the predicted flow value of the ith hour in the future in a result list, and when i is less than P, replacing the 0-filled flow value at the corresponding position in the input data of this iteration with that predicted flow value to obtain the input data of the (i +1) th iteration, wherein i is 1,2, … and P.
Specific processes can be seen in fig. 4, and fig. 4 is a schematic diagram of a prediction process according to an embodiment of the present invention. Let T be 100, P be 24, and N be 50.
Inputting input data with the dimensionality of [124, 2] into a flood flow prediction model as input data of the 1 st iteration, and when i is equal to 1, namely the 1 st iteration, specifically selecting the latest data corresponding to T hours, namely 100-hour flow data and rainfall data to participate in the iteration. Extracting time sequence characteristics by utilizing three layers of one-dimensional convolutional layers, performing classification prediction by utilizing three layers of fully-connected layers to obtain a flow prediction value of the next 1 hour, and storing the flow prediction value in a result list; and at this point it is decided that i < 24, the input data for the 2 nd iteration is reconstructed using the predicted flow value for the 1 st hour in the future. Specifically, the flow predicted value of the 1 st hour in the future, which is filled with 0 in the original input data, is replaced with the flow predicted value of the 1 st hour in the future, which is predicted by the 1 st iteration, that is, the black rectangular part in fig. 4, so as to obtain the input data which can be used for the 2 nd iteration. When i is 2, that is, at the 2 nd iteration, the latest 100-hour corresponding data is still selected from the updated input data, which can be visually understood as that the 100-hour data shown by the dashed line box is shifted to the right by one data column, and the leftmost data of the input data for the 1 st iteration is discarded. Namely, the predicted flow value obtained by prediction is used for the prediction process of the next iteration each time, and the data dimension of each iteration is kept unchanged. And the loop is iterated until the prediction for 24 hours is completed. 
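The cyclic iterative prediction described above can be sketched as follows. The trained model is represented by a hypothetical callable `predict_next`, and the toy rule used in the example (last flow plus last rainfall) is only a stand-in for the convolutional network.

```python
def iterative_forecast(rows, t_hours, p_hours, predict_next):
    """Roll a one-step model forward p_hours times.

    rows         -- [T + P, 2] input: [rainfall, flow] per hour, future flow filled with 0
    predict_next -- callable taking a window of t_hours rows and returning one flow value
    """
    data = [list(row) for row in rows]          # copy so the caller's input is not modified
    results = []
    for i in range(1, p_hours + 1):
        window = data[i - 1 : i - 1 + t_hours]  # latest T hours; shifts right by one per iteration
        q_next = predict_next(window)
        results.append(q_next)
        if i < p_hours:
            data[t_hours + i - 1][1] = q_next   # overwrite the 0 placeholder for hour i
    return results


# Toy run with T = 3 and P = 2; the stand-in model adds the last rainfall to the last flow
rows = [[1, 10], [2, 20], [3, 30], [4, 0], [5, 0]]
result_list = iterative_forecast(rows, 3, 2, lambda w: w[-1][1] + w[-1][0])
print(result_list)  # [33, 37]
```

The window length stays at T in every iteration, which matches the statement that the data dimension of each iteration is kept unchanged.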
Aiming at the flood flow prediction model provided by the embodiment of the invention, the flow prediction value of 24 hours in the future can be obtained at one time through the result list, and the flood flow prediction in a time period in the future is realized.
The data dimensions after the layers are processed are labeled in fig. 4 to facilitate understanding of data variations. For the specific processing procedure of the one-dimensional convolution layer and the full link layer, please refer to the prior art for understanding, and will not be described herein.
Referring to fig. 5, fig. 5 is a comparison graph of the flood peak prediction effect according to the embodiment of the present invention. Here, flow is the flood peak value, in millimeters; time is the time, in hours (h); real-q is the true value of the flood peak; and pred-q is the predicted value of the flood peak. The comparison shows that the water flow prediction method based on machine learning can obtain a relatively accurate prediction result.
It can be understood that the flow prediction value obtained by the flood flow prediction model in the embodiment of the present invention is also normalized data, and then, optionally, a numerical value having the same data specification as the actually measured flow value can be obtained by using corresponding data processing. This will not be described in detail.
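One possible form of this data processing is sketched below, assuming the [0,1] normalization was a min-max scaling based on the minimum and maximum of the measured flow. The function name and the sample statistics are hypothetical.

```python
def denormalize(values, flow_min, flow_max):
    """Invert [0, 1] min-max scaling: x = x_norm * (max - min) + min."""
    span = flow_max - flow_min
    return [v * span + flow_min for v in values]


restored = denormalize([0.0, 0.5, 1.0], 20.0, 420.0)
print(restored)  # [20.0, 220.0, 420.0]
```

The minimum and maximum must be the same statistics that were used when the training data was normalized.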
In an optional embodiment, the obtained predicted flow value of the target hydrologic site in the future P hours may be output, for example, sent to another device or displayed, for example, displayed on a display screen of a predetermined electronic device.
In an optional implementation manner, the obtained predicted flow value of the target hydrologic site in the next P hours may be compared with a predetermined threshold, and when the predicted flow value of the target hydrologic site in the next P hours is greater than or equal to the predetermined threshold, warning information is sent.
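The threshold comparison can be sketched as follows. The function names and the message text are assumptions, and the sketch assumes a warning is raised if any of the P predicted hours reaches the threshold, which the description does not specify.

```python
def hours_exceeding(predictions, threshold):
    """Return the 1-based future hours whose predicted flow is >= threshold."""
    return [hour for hour, q in enumerate(predictions, start=1) if q >= threshold]


def warning_message(predictions, threshold):
    """Build a warning string, or return None when no hour reaches the threshold."""
    hours = hours_exceeding(predictions, threshold)
    if not hours:
        return None
    return f"Warning: predicted flow reaches the threshold {threshold} at future hour(s) {hours}"


alert_hours = hours_exceeding([120.0, 180.0, 250.0, 240.0], 200.0)
print(alert_hours)  # [3, 4]
```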
In the scheme provided by the embodiment of the invention, a flood flow prediction model is established based on a cyclic recursion methodology, input data of a pre-trained flood flow prediction model is established by utilizing rainfall data and flow data of years in historical data and known rainfall data of P hours in the future as prior data, through multiple cyclic iterations, input data of iteration of the next hour in the future is established by utilizing the flow prediction value of one hour in the future obtained by each prediction, and through P iterations, the flow prediction value of P hours in the future can be output by the model once, so that the purpose of predicting the flood flow of one time period in the future is realized.
In a second aspect, corresponding to the above method embodiment, an embodiment of the present invention further provides a water flow predicting apparatus based on machine learning, as shown in fig. 6, where the apparatus includes:
an original data acquisition module 601, configured to acquire original rainfall data and original flow data of a predetermined drainage basin; the original rainfall data comprises historical K-year hourly rainfall data of N hydrological stations of a preset basin, and the original flow data comprises historical K-year hourly flow data of target hydrological stations located at an outlet section of the preset basin in the N hydrological stations;
a data preprocessing module 602, configured to preprocess the original rainfall data and the original flow data to obtain processed rainfall data and processed flow data, respectively;
the input data construction module 603 is configured to construct input data based on the processed rainfall data, the processed flow data, and pre-acquired known rainfall data of each hydrological site in the future P hours;
the input data prediction module 604 is configured to input data into a pre-trained flood flow prediction model for performing cyclic iteration prediction, extract timing characteristics from the input data of the iteration and perform classification prediction during each iteration to obtain a flow prediction value of a target hydrological site in one hour in the future, reconstruct the input data of the iteration of the next hour in the future by using the obtained flow prediction value of the hour in the future until P iterations are completed, and obtain a flow prediction value of the target hydrological site in P hours in the future; wherein N, K and P are natural numbers greater than 1.
In the scheme provided by the embodiment of the invention, a flood flow prediction model is established based on a cyclic recursion methodology, input data of a pre-trained flood flow prediction model is established by utilizing rainfall data and flow data of years in historical data and known rainfall data of P hours in the future as prior data, through multiple cyclic iterations, input data of iteration of the next hour in the future is established by utilizing the flow prediction value of one hour in the future obtained by each prediction, and through P iterations, the flow prediction value of P hours in the future can be output by the model once, so that the purpose of predicting the flood flow of one time period in the future is realized.
In a third aspect, an embodiment of the present invention further provides an electronic device, as shown in fig. 7, including a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 complete mutual communication through the communication bus 704,
a memory 703 for storing a computer program;
the processor 701 is configured to implement the steps of the water flow prediction method based on machine learning according to the first aspect when executing the program stored in the memory 703.
The electronic device may be: desktop computers, laptop computers, intelligent mobile terminals, servers, and the like. Without limitation, any electronic device that can implement the present invention is within the scope of the present invention.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
The above electronic device can realize the following: a flood flow prediction model is established based on a cyclic recursion methodology; input data for the pre-trained flood flow prediction model is constructed using several years of rainfall data and flow data from the historical data, together with known rainfall data for the next P hours, as prior data; in each of multiple cyclic iterations, the flow prediction value obtained for one future hour is used to construct the input data for the iteration of the following hour; and after P iterations the model outputs the flow prediction values for the next P hours at once, thereby achieving flood flow prediction for a future time period.
In a fourth aspect, corresponding to the method for predicting water flow based on machine learning provided in the first aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements the steps of the method for predicting water flow based on machine learning provided in the embodiment of the present invention.
The computer-readable storage medium stores an application program that executes the machine learning-based water flow prediction method provided by the embodiment of the present invention when executed, and thus can implement: the method comprises the steps of establishing a flood flow prediction model based on a cyclic recursion methodology, establishing input data of a pre-trained flood flow prediction model by utilizing rainfall data and flow data of years in historical data and known rainfall data of P hours in the future as prior data, establishing input data of iteration of the next hour in the future by utilizing a flow prediction value of one hour in the future obtained by each prediction through multiple cyclic iteration, and outputting the flow prediction value of P hours in the future by the model through P iterations, so that the purpose of predicting the flood flow of one time period in the future is achieved.
For the apparatus/electronic device/storage medium embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
It should be noted that, the apparatus, the electronic device and the storage medium according to the embodiments of the present invention are respectively an apparatus, an electronic device and a storage medium to which the water flow prediction method based on machine learning is applied, and all embodiments of the water flow prediction method based on machine learning are applicable to the apparatus, the electronic device and the storage medium, and can achieve the same or similar beneficial effects.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A water flow prediction method based on machine learning is characterized by comprising the following steps:
acquiring original rainfall data and original flow data of a preset watershed; the original rainfall data comprises historical K-year hourly rainfall data of N hydrological stations of the preset watershed, and the original flow data comprises historical K-year hourly flow data of target hydrological stations located at an outlet section of the preset watershed from the N hydrological stations;
preprocessing the original rainfall data and the original flow data to respectively obtain processed rainfall data and processed flow data;
constructing input data based on the processed rainfall data, the processed flow data and the pre-acquired known rainfall data of each hydrological site in the future P hours;
inputting the input data into a flood flow prediction model obtained by pre-training for cyclic iteration prediction, extracting time sequence characteristics from the input data of the iteration and performing classification prediction during each iteration to obtain a flow prediction value of the target hydrological station in one hour in the future, and reconstructing the input data of the iteration of the next hour in the future by using the obtained flow prediction value of the hour in the future until P iterations are completed to obtain a flow prediction value of the target hydrological station in P hours in the future; wherein N, K and P are natural numbers greater than 1.
2. The method of claim 1, wherein pre-processing the raw rainfall data and the raw flow data to obtain processed rainfall data and processed flow data, respectively, comprises:
performing data elimination and data completion processing on the original rainfall data to obtain completed rainfall data;
performing normalization processing on the supplemented rainfall data to obtain processed rainfall data;
and carrying out normalization processing on the original flow data to obtain processed flow data.
3. The method according to claim 2, wherein the step of performing data elimination and data completion processing on the original rainfall data to obtain completed rainfall data comprises:
removing data corresponding to hydrological sites with the data number lower than the preset number in the original rainfall data to obtain residual rainfall data;
and performing data completion on the rainfall data missing from the residual rainfall data by using an inverse distance weighting method to obtain completed rainfall data.
4. The method of claim 2, wherein the normalization process comprises a [0,1] normalization process.
5. The method of claim 2, wherein constructing input data based on the processed rainfall data, the processed flow data, and pre-acquired known rainfall data for each hydrological site at P hours in the future comprises:
determining first normalized rainfall sum data of the N hydrological stations corresponding to each hour in T hours based on the processed rainfall data; determining second normalized rainfall sum data of the N hydrological stations corresponding to each hour in the future P hours based on the known rainfall data of the hydrological stations in the future P hours, and forming rainfall data components by the first normalized rainfall sum data and the second normalized rainfall sum data;
selecting data corresponding to T hours from the processed flow data, and forming a flow data component with the flow data to be predicted of the target hydrological site, which is filled with 0 and is in the P hours in the future;
and forming input data with a dimensionality of [ T + P, 2] by the rainfall data component and the flow data component, wherein T is a natural number larger than 1, and T is smaller than or equal to the number of hours corresponding to the K year.
6. The method according to claim 1 or 5, wherein the inputting the input data into a flood flow prediction model obtained by pre-training for cyclic iterative prediction, extracting timing characteristics from the input data of each iteration at each iteration and performing classification prediction to obtain a flow prediction value of the target hydrologic site in one hour in the future, and reconstructing the input data for the iteration in the next hour in the future by using the obtained flow prediction value in the hour in the future, comprises:
aiming at the ith iteration, extracting time sequence characteristics of rainfall data and flow data corresponding to the latest T hours from input data of the iteration by using three layers of one-dimensional convolution layers of the flood flow prediction model, and performing classification prediction by using three layers of full-connected layers of the flood flow prediction model to obtain a flow prediction value of the target hydrological site in the ith hour in the future;
and storing the predicted flow value of the ith hour in the future in a result list, and when i is smaller than P, replacing the flow data with the position of 0 in the input data of the iteration to obtain the input data of the (i +1) th iteration, wherein i is 1,2, … and P.
7. The method of claim 6, wherein the numbers of convolution kernels of the three one-dimensional convolution layers are 50, 30 and 10, respectively; the convolution kernels of the three one-dimensional convolution layers are all of size 5 × 2; and the numbers of neurons of the three fully connected layers are 10, 10 and 1, respectively.
8. A water flow prediction device based on machine learning, comprising:
an original data acquisition module, configured to acquire original rainfall data and original flow data of a preset watershed; the original rainfall data comprises historical K-year hourly rainfall data of N hydrological stations of the preset watershed, and the original flow data comprises historical K-year hourly flow data of target hydrological stations located at an outlet section of the preset watershed from the N hydrological stations;
the data preprocessing module is used for preprocessing the original rainfall data and the original flow data to respectively obtain processed rainfall data and processed flow data;
the input data construction module is used for constructing input data based on the processed rainfall data, the processed flow data and the pre-acquired known rainfall data of each hydrological site in the future P hours;
the input data prediction module is used for inputting the input data into a flood flow prediction model obtained by pre-training for cyclic iteration prediction, extracting time sequence characteristics from the input data of the iteration and performing classification prediction during each iteration to obtain a flow prediction value of the target hydrological station in one hour in the future, and reconstructing the input data of the iteration of the next hour in the future by using the obtained flow prediction value of the next hour until P iterations are completed to obtain a flow prediction value of the target hydrological station in P hours in the future; wherein N, K and P are natural numbers greater than 1.
9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor, when executing the program stored in the memory, implementing the method steps of any of claims 1-7.
10. A computer-readable storage medium, characterized in that,
the computer-readable storage medium has stored therein a computer program which, when being executed by a processor, carries out the method steps of any one of claims 1 to 7.
CN202011375758.5A 2020-11-30 2020-11-30 Water flow prediction method and device based on machine learning and electronic equipment Withdrawn CN112561135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011375758.5A CN112561135A (en) 2020-11-30 2020-11-30 Water flow prediction method and device based on machine learning and electronic equipment

Publications (1)

Publication Number Publication Date
CN112561135A true CN112561135A (en) 2021-03-26

Family

ID=75045495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011375758.5A Withdrawn CN112561135A (en) 2020-11-30 2020-11-30 Water flow prediction method and device based on machine learning and electronic equipment

Country Status (1)

Country Link
CN (1) CN112561135A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361819A (en) * 2021-07-08 2021-09-07 武汉中科牛津波谱技术有限公司 Linear prediction method and device
CN113361819B (en) * 2021-07-08 2023-04-07 武汉中科牛津波谱技术有限公司 Linear prediction method and device

Legal Events

Date Code Title Description
PB01 Publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20210326