CN116128039A - Construction method and prediction method of surface water quality prediction model - Google Patents

Publication number
CN116128039A
Authority
CN
China
Prior art keywords
data
water quality
monitoring
prediction
station
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310007638.7A
Other languages
Chinese (zh)
Inventor
张凯
成露
马宏宇
蒋泽虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wpg Shanghai Smart Water Public Co ltd
Original Assignee
Wpg Shanghai Smart Water Public Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wpg Shanghai Smart Water Public Co ltd
Priority to CN202310007638.7A
Publication of CN116128039A
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a construction method and a prediction method for a surface water quality prediction model. The construction method comprises: acquiring monitoring data and spatial position data related to a plurality of monitoring stations, wherein the monitoring data comprise water quality data; determining the prediction type of each monitoring station based on the spatial position data, reconstructing the monitoring data according to the prediction type of the monitoring station, and performing feature extraction on the reconstructed monitoring data to form time-series feature data; generating sample data from the time-series feature data; and training a pre-constructed water quality prediction model using the sample data. Because historical monitoring data, reconstructed with the help of the spatial position data, serve as the prediction input, modeling difficulty and cost are reduced and dependence on basic data is low; at the same time the prediction effect is good, intelligent prediction is realized, and the accuracy of the prediction result is improved.

Description

Construction method and prediction method of surface water quality prediction model
Technical Field
The invention relates to the technical field of surface water quality monitoring, in particular to a construction method and a prediction method of a surface water quality prediction model.
Background
With rapid socio-economic development, water shortages and source-water pollution have markedly increased the environmental pressure on water sources, and water safety problems have become increasingly prominent. Most cities in China that take surface water bodies as water sources face high ammonia nitrogen concentrations, algae proliferation, sudden pollution accidents and similar conditions. Water plants therefore need to add pretreatment and advanced treatment to their conventional processes to ensure the quality of the water leaving the plant, while tracking the trend of raw water quality at all times and adjusting the treatment process promptly and accurately. As the first safety barrier of the water supply system, an accurate and effective water quality detection and early warning system for surface water bodies is both necessary and critical.
Most surface water body water quality detection and early warning systems are managed by environmental protection or water conservancy departments, professional hydraulic software is adopted to predict and simulate water quality parameters, and if the water quality exceeds a water quality setting threshold value, an alarm signal is sent out.
Problems of the prior art:
1. Modeling is complex and costly: most research and applications adopt mechanistic prediction methods, which require a large amount of basic data for modeling and a complex solution and calculation of water quality control parameters; professional hydraulic software is needed, and the application cost is high.
2. Early warning information lag: the traditional water quality evaluation mainly adopts a manual sampling or online detection mode, most of the traditional water quality evaluation alarms according to thresholds set by national standards or experience methods, analysis is carried out on a single time point, and time sequence change information of water quality cannot be introduced, so that early warning information is delayed or invalid.
3. Poor correlation with water works production: the surface water body water quality detection time interval is longer, and if water pollution occurs, the detection effectiveness is obviously reduced, and the early warning effect cannot be realized. The detection frequency of the instrument should be properly adjusted by combining the actual water production process of the water plant.
Disclosure of Invention
Based on the problems, the invention provides a construction method and a prediction method of a surface water quality prediction model, and aims to solve the technical problems of complex surface water quality prediction calculation, high cost, low efficiency and the like in the prior art.
A construction method of a surface water quality prediction model comprises the following steps:
step A1, acquiring monitoring data and spatial position data related to a plurality of monitoring stations, wherein the monitoring data comprise water quality data and weather data;
step A2, preprocessing the monitoring data;
step A3, determining the prediction type of the monitoring station based on the spatial position data, reconstructing the preprocessed monitoring data according to the prediction type of the monitoring station, and then performing feature extraction on the reconstructed monitoring data to form time-series feature data;
step A4, generating sample data according to the time sequence characteristic data;
and step A5, training a pre-constructed water quality prediction model by using sample data, wherein the water quality prediction model is used for predicting the surface water quality of the monitoring station.
Further, step A3 includes:
step A31, determining the prediction type of the monitoring station based on the spatial position data, and reconstructing the preprocessed monitoring data according to the prediction type of the monitoring station;
step A32, carrying out correlation analysis on the reconstructed monitoring data to obtain feature correlation coefficients;
and step A33, retaining the features whose correlation coefficient is not smaller than a preset correlation threshold to form time-series feature data.
Further, the prediction types of the monitored stations include single station prediction, double station prediction and multi-station prediction, and step a31 includes:
if the prediction type of the monitoring station is single-station prediction, only taking the monitoring data of the monitoring station as reconstructed monitoring data;
if the prediction type of the monitoring station is double-station prediction, combining the monitoring data of the upstream monitoring station or the downstream monitoring station corresponding to the monitoring station with the monitoring data of the monitoring station to obtain reconstructed monitoring data;
if the prediction type of the monitoring station is multi-station prediction, combining the monitoring data of the upstream monitoring station and the downstream monitoring station corresponding to the monitoring station and the monitoring data of the monitoring station to obtain reconstructed monitoring data.
Further, step A4 includes:
and step A41, selecting a series of input samples from the time series characteristic data by adopting a sliding window technology, taking the time series characteristic data with a preset time length after a sliding window corresponding to the input samples as a prediction target, and taking the input samples and the prediction target as sample data.
Further, the water quality prediction model comprises a first network model and a second network model;
in step A5, the training process of the water quality prediction model includes:
step A51, dividing the time sequence characteristic data into a univariate time sequence and a multivariate time sequence;
step A52, the univariate time sequence is used as the input of the first network model, the multivariate time sequence is used as the input of the second network model, and the water quality prediction model is trained.
Further, the first network model is an EEMD-CNN-LSTM neural network model, and the second network model is a CNN-LSTM neural network model.
A method for predicting surface water quality, which uses the water quality prediction model obtained by the above construction method of the surface water quality prediction model, comprises the following steps:
step B1, acquiring monitoring data and space position data related to a station to be monitored, wherein the monitoring data comprises water quality data and weather data;
step B2, preprocessing the monitoring data;
step B3, determining the prediction type of the station to be monitored based on the spatial position data of the station to be monitored, reconstructing the preprocessed monitoring data according to the prediction type of the station to be monitored, and then performing feature extraction on the reconstructed monitoring data to form time sequence feature data;
and step B4, using the time sequence characteristic data as the input of a trained water quality prediction model, and outputting a water quality prediction result of the site to be monitored by the water quality prediction model.
Further, the method further comprises the following steps:
and B5, analyzing a water quality prediction result of the site to be monitored, and generating early warning information when the water quality prediction result meets the early warning condition.
Further, step B5 includes:
step B52, judging whether the water quality prediction result meets the fixed early warning condition:
if yes, go to step B53;
if not, go to step B54;
step B53, generating and outputting first early warning information and outputting a water quality prediction result;
step B54, judging whether the water quality prediction result meets the dynamic early warning condition:
if yes, go to step B55;
if not, go to step B56;
step B55, generating and outputting second early warning information and outputting a water quality prediction result;
and step B56, outputting a water quality prediction result.
Further, the fixed early warning conditions are: the water quality prediction result exceeds a preset fixed threshold value;
the dynamic early warning conditions include: the water quality prediction result exceeds a preset percentage of a lower limit value of the dynamic threshold corresponding to the season.
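The two-level judgment of steps B52 to B56 can be sketched as follows. This is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, and the dynamic condition is read here as "the prediction exceeds percentage × seasonal lower limit", which is only one possible reading of the wording above.

```python
from typing import Optional


def early_warning(prediction: float,
                  fixed_threshold: float,
                  dynamic_lower_limit: float,
                  percentage: float) -> Optional[str]:
    """Return a warning label for one predicted value, or None.

    dynamic_lower_limit is the lower limit of the dynamic threshold for
    the current season; percentage is a fraction such as 0.9 (assumed
    interpretation of "a preset percentage of the lower limit").
    """
    # Steps B52/B53: the fixed condition is checked first.
    if prediction > fixed_threshold:
        return "first warning"
    # Steps B54/B55: otherwise check the seasonal dynamic condition.
    if prediction > percentage * dynamic_lower_limit:
        return "second warning"
    # Step B56: no warning, only the prediction result is output.
    return None
```

In every branch the prediction result itself would still be output; only the warning label differs.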
The beneficial technical effects of the invention are as follows: by using historical water quality data, introducing weather data and reconstructing monitoring data by combining spatial position data as prediction input data, modeling difficulty and modeling cost are reduced, the dependence on basic data is low, meanwhile, the prediction effect is good, intelligent prediction is realized, and the accuracy of a prediction result is improved.
Drawings
FIG. 1 is a flow chart of the steps of a method for constructing a surface water quality prediction model according to the present invention;
FIG. 2 is a flowchart of the preprocessing steps of a method for constructing a surface water quality prediction model according to the present invention;
FIG. 3 is a flow chart of the feature extraction steps of a method for constructing a surface water quality prediction model according to the present invention;
FIG. 4 is a flowchart of sample data acquisition steps of a method for constructing a surface water quality prediction model according to the present invention;
FIG. 5 is a schematic diagram of sample data obtained by sliding window technology of a method for constructing a surface water quality prediction model according to the present invention;
FIG. 6 is a flowchart of the training process steps of a method for constructing a surface water quality prediction model according to the present invention;
FIGS. 7-8 are structural frame diagrams of a water quality prediction model of a construction method of a surface water quality prediction model according to the present invention;
FIG. 9 is a flow chart of the steps of a method for predicting the quality of a body of surface water;
FIG. 10 is a flow chart of a feature extraction step of a method for predicting the quality of a body of surface water;
FIG. 11 is a flow chart of the model predictive process steps of a method for predicting the quality of a body of surface water;
FIG. 12 is a flowchart of the alarm judging process steps of a method for predicting the quality of surface water;
FIG. 13 is a flow chart of the dynamic threshold acquisition process steps of a method for predicting the quality of a body of surface water.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
The invention is further described below with reference to the drawings and specific examples, which are not intended to be limiting.
Referring to fig. 1 and 3, the invention provides a construction method of a surface water quality prediction model, which comprises the following steps:
step A1, acquiring monitoring data and spatial position data related to a plurality of monitoring stations, wherein the monitoring data comprise water quality data and weather data;
step A2, preprocessing the monitoring data;
step A3, determining the prediction type of the monitoring station based on the spatial position data, reconstructing the preprocessed monitoring data according to the prediction type of the monitoring station, and then performing feature extraction on the reconstructed monitoring data to form time-series feature data;
step A4, generating sample data according to the time sequence characteristic data;
and step A5, training a pre-constructed water quality prediction model by using sample data, wherein the water quality prediction model is used for predicting the surface water quality of the monitoring station.
The prediction types of a monitoring station comprise single-station prediction, double-station prediction and multi-station prediction. A downstream river section is generally fed by several upstream tributaries, so downstream and upstream water quality are spatially correlated; if monitoring stations stand in an upstream-downstream relationship, the water quality parameters of the upstream and downstream stations can be brought into the analysis. Adding the water quality features of the upstream and downstream sections helps improve the prediction capability of the model. Single-station prediction means that the monitoring station has neither a corresponding upstream monitoring station nor a corresponding downstream monitoring station. Double-station prediction means that the monitoring station has either an upstream monitoring station or a downstream monitoring station. Multi-station prediction means that the monitoring station has both an upstream monitoring station and a downstream monitoring station.
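As an illustration only, the three prediction types can be derived from whether the spatial position data name an upstream and/or a downstream station. The function name, the use of None for a missing neighbour and the returned labels are assumptions made for this sketch.

```python
from typing import Optional


def prediction_type(upstream: Optional[str], downstream: Optional[str]) -> str:
    """Classify a station by which neighbouring stations exist.

    upstream / downstream hold the section name of the neighbouring
    station, or None when the station has no such neighbour.
    """
    has_up = upstream is not None
    has_down = downstream is not None
    if has_up and has_down:
        return "multi-station"
    if has_up or has_down:
        return "double-station"
    return "single-station"
```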
Specifically, the water quality data in step A1 includes, but is not limited to, pH, dissolved oxygen, potassium permanganate, ammonia nitrogen, total phosphorus, total nitrogen, conductivity, turbidity, chlorophyll, algae density, and the like.
In particular, weather data includes, but is not limited to, temperature, rainfall, wind speed, barometric pressure, humidity, and the like.
Specifically, the spatial position data include: the section name and river basin name of the upstream monitoring station corresponding to the monitoring station, and the section name and river basin name of the corresponding downstream monitoring station.
Referring to fig. 2, further, the preprocessing in step A2 includes:
step A21, identifying abnormal data in the monitoring data and deleting the abnormal data;
step A22, filling in the missing data in the monitoring data from which the abnormal data have been deleted;
and step A23, standardizing the monitoring data in which the missing data have been filled.
In step A21, the abnormal data are identified and deleted using the box plot method;
in the box-plot method:
$$\text{upper limit} = Q_3 + 1.5 \times \mathrm{IQR}$$
$$\text{lower limit} = Q_1 - 1.5 \times \mathrm{IQR}$$
In the above, $Q_3$ is the upper quartile, $Q_1$ is the lower quartile, and IQR is the interquartile range, $\mathrm{IQR} = Q_3 - Q_1$. A value greater than the upper limit or less than the lower limit is classified as an abnormal value.
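A minimal sketch of the box plot rule, using only the Python standard library. The quartile convention (the "inclusive" method of `statistics.quantiles`) and the choice to mark deleted values as None, so that a later step can fill them, are assumptions of this sketch.

```python
import statistics


def iqr_bounds(values):
    """Return (lower limit, upper limit) from the 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def drop_outliers(values):
    """Replace values outside the box plot limits with None."""
    lower, upper = iqr_bounds(values)
    return [v if lower <= v <= upper else None for v in values]
```

For example, in the pH-like series `[7.1, 7.0, 7.2, 6.9, 7.3, 15.0]` only the reading 15.0 falls outside the limits and is removed.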
In step A22, interpolation is carried out on the missing data by adopting a linear interpolation method;
linear interpolation:
$$y = y_0 + \frac{(x - x_0)(y_1 - y_0)}{x_1 - x_0}$$
In the above, $A(x_0, y_0)$ and $B(x_1, y_1)$ are known points, and $P(x, y)$ is the point to be interpolated.
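The linear interpolation of missing data can be sketched as below, assuming equally spaced samples so that the list index plays the role of x, and assuming missing values are marked as None. Gaps at the very start or end of the series have no known point on one side and are left unfilled here; how the method handles them is not stated.

```python
def fill_missing(values):
    """Fill None gaps by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    # Walk over each pair of consecutive known points A and B.
    for left, right in zip(known, known[1:]):
        y0, y1 = filled[left], filled[right]
        for i in range(left + 1, right):
            filled[i] = y0 + (i - left) * (y1 - y0) / (right - left)
    return filled
```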
Standardization is a key step in building a neural network model: the raw data of the independent and dependent variables are scaled to a common range, which ensures that all input and output data receive equal attention during training and speeds up network convergence.
In step A23, the standardization uses the following mathematical formula:
$$X_i = \frac{x_i - \mu}{\sigma}$$
In the above, $X_i$ is the standardized data, $x_i$ is the raw data, $\mu$ is the mean of the data, and $\sigma$ is the standard deviation of the data.
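A short sketch of this standardization for one feature. Using the population standard deviation is an assumption, since the text does not say which estimator is meant; the mean and standard deviation are returned so that predictions can later be mapped back to the original units.

```python
import statistics


def standardize(values):
    """Return (standardized values, mean, standard deviation)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(v - mu) / sigma for v in values], mu, sigma
```

A constant series has a standard deviation of zero and would need separate handling before this function is applied.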
Between step A22 and step A23, a step A24 is further included to remove noise from the monitoring data.
The preprocessing includes abnormal value deletion, missing value interpolation and standardization. The surface water environment is a complex nonlinear system: water quality is influenced by hydro-meteorological factors such as precipitation and temperature and is easily affected by pollutant discharge, which causes the physical and chemical indexes of the water to change dynamically. Data preprocessing is therefore carried out to handle missing values, abnormal values and noise in the data. Abnormal data are identified and deleted using the box plot method, missing data are filled using linear interpolation, and noisy data are decomposed and reconstructed using the EEMD method.
In the step A3, feature extraction is carried out on the preprocessed monitoring data to form feature data; feature selection is one of the key steps for constructing a water quality prediction model, and the rationality of selected input items and output items determines the prediction effect and generalization performance of the final model.
Referring to fig. 3, further, the step A3 includes:
step A31, determining the prediction type of the monitoring site based on the spatial position data, and reconstructing the preprocessed monitoring data according to the prediction type of the monitoring site;
step A32, carrying out correlation analysis on the reconstructed monitoring data to obtain feature correlation coefficients;
and step A33, retaining the features whose correlation coefficient is not smaller than a preset correlation threshold to form time-series feature data.
Specifically, in step A31:
if the prediction type of the monitoring station is single-station prediction, only taking the monitoring data of the monitoring station as reconstructed monitoring data;
if the prediction type of the monitoring station is double-station prediction, combining the monitoring data of the upstream monitoring station or the downstream monitoring station corresponding to the monitoring station with the monitoring data of the monitoring station to obtain reconstructed monitoring data;
if the prediction type of the monitoring station is multi-station prediction, combining the monitoring data of the upstream monitoring station and the downstream monitoring station corresponding to the monitoring station and the monitoring data of the monitoring station to obtain reconstructed monitoring data.
In step A31, the monitoring data of the upstream monitoring station and the downstream monitoring station are the monitoring data preprocessed in step A2.
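The reconstruction in step A31 amounts to joining a station's records with those of its neighbours. The sketch below assumes each station's data is a mapping from timestamp to feature values, keeps only timestamps present at every contributing station, and prefixes feature names with "own", "up" and "down"; the data layout, the join rule and the prefixes are all hypothetical choices for illustration.

```python
from typing import Dict, Optional

Series = Dict[str, Dict[str, float]]  # timestamp -> {feature name: value}


def reconstruct(own: Series,
                upstream: Optional[Series] = None,
                downstream: Optional[Series] = None) -> Series:
    """Combine a station's data with upstream/downstream data if present."""
    sources = [("own", own)]
    if upstream is not None:
        sources.append(("up", upstream))
    if downstream is not None:
        sources.append(("down", downstream))
    # Keep only timestamps recorded at every contributing station.
    common = set(own)
    for _, series in sources[1:]:
        common &= set(series)
    merged: Series = {}
    for ts in sorted(common):
        row = {}
        for prefix, series in sources:
            for name, value in series[ts].items():
                row[f"{prefix}_{name}"] = value
        merged[ts] = row
    return merged
```

Calling `reconstruct(own)` corresponds to single-station prediction, passing one neighbour to double-station prediction, and passing both to multi-station prediction.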
Adding the monitoring data of the upstream and downstream stations identified from the spatial position data can improve model accuracy compared with mining the data of a single monitoring station. However, using too many input features reduces model efficiency, so correlation analysis is performed on the reconstructed monitoring data to reduce the input dimension.
Correlation analysis range: correlation among water quality data, correlation between water quality data and weather data, and spatial correlation between water quality data of upstream and downstream monitoring stations.
Specifically, in step a32, a Spearman correlation coefficient method is used to perform correlation analysis on the monitored data.
The Spearman correlation coefficient indicates the direction of association between X (the independent variable) and Y (the dependent variable). If Y tends to increase as X increases, the Spearman correlation coefficient is positive; if Y tends to decrease as X increases, the Spearman correlation coefficient is negative; a Spearman correlation coefficient of 0 indicates that Y shows no tendency as X increases. The specific mathematical formula is as follows:
$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
In the above, $d_i$ is the difference between the ranks of $X_i$ and $Y_i$, and $n$ is the number of samples.
In step A33, a feature is retained if its correlation coefficient is greater than or equal to the preset correlation threshold; otherwise it is not retained. For example, when selecting features for dissolved oxygen, if temperature is highly correlated with dissolved oxygen, the temperature feature is used as an input for dissolved oxygen prediction; features that have a low correlation with dissolved oxygen are not retained.
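Steps A32 and A33 can be sketched as follows. The rank-difference formula used here assumes there are no tied values in either series. Comparing the absolute value of the coefficient with the threshold is also an assumption of this sketch, made so that strongly negative relationships count as high correlation; the text only says the coefficient is compared with the threshold.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman coefficient from squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def select_features(features, target, threshold):
    """Keep feature names whose |coefficient| reaches the threshold."""
    return [name for name, series in features.items()
            if abs(spearman(series, target)) >= threshold]
```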
Referring to fig. 4 and 5, further, in step A4, generating sample data according to the time series characteristic data specifically includes:
and step A41, selecting a series of input samples from the time series characteristic data by adopting a sliding window technology, taking the time series characteristic data with a preset time length after a sliding window corresponding to the input samples as a prediction target, and taking the input samples and the prediction target as sample data.
And generating an input sample and a predicted target by adopting a sliding window technology. Setting the length of the sliding window as M, selecting an input sample with the length of the sliding window as M from the time series characteristic data, and simultaneously selecting the time series characteristic data with the preset time length of N after the sliding window as a prediction target, namely predicting the water quality data of the time interval with the preset time length of N in the future through the time series characteristic data with the time length of M in the history. By continuously moving the sliding window, a series of input samples and predicted targets are generated, and sample data is formed by the input samples and the predicted targets, so that time series characteristic data can be fully utilized.
Preferably, the single step prediction time window setting: the length m=18 of the sliding window, the predetermined time length n=1;
preferably, the multi-step prediction time window setting: the length of the sliding window m=42 and the predetermined time length n=6.
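A minimal sketch of the sliding window sampling, where `window` corresponds to M and `horizon` to N. Moving the window one time step at a time is an assumption; the text only says the window is moved continuously.

```python
def make_samples(series, window, horizon):
    """Return (input sample, prediction target) pairs from a series."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = series[start:start + window]                     # M historical points
        y = series[start + window:start + window + horizon]  # next N points
        samples.append((x, y))
    return samples
```

With the single-step setting M = 18, N = 1, a series of 24 time steps yields 24 - 18 - 1 + 1 = 6 samples.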
Further, in step A4, after step A41, the method further includes:
step A42, dividing the sample data into a training set, a test set and a verification set.
Specifically, the division ratio of the training set, the test set, and the validation set is set to p1:p2:p3, where p1+p2+p3=1 and p1> (p2+p3).
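The division can be sketched as below. Splitting in chronological order without shuffling, and placing the test set before the verification set, are assumptions of this sketch; the text fixes only the constraints on the ratios.

```python
def split_samples(samples, p1, p2, p3):
    """Split samples into (training, test, verification) sets by ratio."""
    if abs(p1 + p2 + p3 - 1.0) > 1e-9:
        raise ValueError("p1 + p2 + p3 must equal 1")
    if p1 <= p2 + p3:
        raise ValueError("p1 must be greater than p2 + p3")
    n = len(samples)
    n_train = round(n * p1)
    n_test = round(n * p2)
    train = samples[:n_train]
    test = samples[n_train:n_train + n_test]
    verification = samples[n_train + n_test:]
    return train, test, verification
```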
Step A5, training a pre-constructed water quality prediction model by using sample data, wherein the water quality prediction model is used for predicting water quality data of a monitoring station;
specifically, the framework of the water quality prediction model is shown in fig. 7-8, and further, the water quality prediction model comprises a first network model and a second network model.
Referring to fig. 6, in step A5, the training process of the water quality prediction model includes:
step A51, dividing the time sequence characteristic data into a univariate time sequence and a multivariate time sequence;
step A52, a univariate time sequence is used as the input of a first network model, a multivariate time sequence is used as the input of a second network model, and a water quality prediction model is trained;
specifically, in step a51, the number of variables included in the time-series characteristic data is used as a division basis. If the variable feature quantity is 1, the time series is single variable, and if the variable feature quantity is more than 1, the time series is multi-variable.
Specifically, in step A52, the first network model is an EEMD-CNN-LSTM neural network model.
Specifically, in step A52, the second network model is a CNN-LSTM neural network model.
And taking the output result of the first network model and the output result of the second network model as water quality prediction results.
Changes in surface water quality data show a certain periodicity, but under the influence of many factors they follow complex nonlinear trends, which places high demands on a water quality prediction model. A convolutional neural network (CNN) can perform a convolution operation on each time series and extract local features of the water quality data well, but it is insensitive to temporal order and cannot complete the prediction task well on its own. Combining a CNN with a long short-term memory (LSTM) neural network makes full use of the CNN's feature extraction capability and the LSTM's sensitivity to temporal order to complete complex water quality prediction.
Ensemble empirical mode decomposition (EEMD) can decompose a complex signal into a linear combination of a finite number of intrinsic mode functions (IMFs) with frequencies ordered from high to low. Each decomposed IMF component contains local characteristic information of the original signal, which completes the stationarization of the signal.
EEMD-CNN-LSTM operating mechanism: the time series is first decomposed into a plurality of IMF components and a residual (Res) component; each IMF component and the Res component are then used as the input of a separate CNN-LSTM neural network model; finally, the prediction results of the CNN-LSTM neural network models are summed to obtain the output of the EEMD-CNN-LSTM neural network model.
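In symbols, the mechanism can be written as follows. The notation is introduced here for illustration only: $K$ is the number of IMF components, $f_1, \dots, f_{K+1}$ are the separately trained CNN-LSTM models, and $\hat{y}$ is the output of the EEMD-CNN-LSTM model.

$$x(t) = \sum_{k=1}^{K} \mathrm{IMF}_k(t) + \mathrm{Res}(t)$$
$$\hat{y} = \sum_{k=1}^{K} f_k\left(\mathrm{IMF}_k\right) + f_{K+1}\left(\mathrm{Res}\right)$$

The first line is the decomposition of the input series; the second sums the component-wise predictions, mirroring the way the components sum to the original signal.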
Further, the parameters of each CNN-LSTM neural network are set as follows: convolution kernel size 3, 2 hidden layers, 128 neurons per hidden layer, ReLU activation function, L2 regularization, learning rate 0.001, 100 training epochs, batch size 64, and the Adam optimizer.
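For reference, these settings can be collected in a single configuration object. The class and field names are hypothetical, and the values are not tied to any particular deep learning framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnnLstmConfig:
    """Hyperparameters of one CNN-LSTM network as listed in the text."""
    kernel_size: int = 3
    hidden_layers: int = 2
    hidden_units: int = 128
    activation: str = "relu"
    regularization: str = "l2"
    learning_rate: float = 0.001
    epochs: int = 100
    batch_size: int = 64
    optimizer: str = "adam"
```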
Specifically, in step a52, the water quality prediction model is trained using the time-series feature data divided in the training set.
Further, after step A52, the method further includes:
step A53, performing super-parameter adjustment on the water quality prediction model by using the time sequence characteristic data after the division in the verification set;
and step A54, performing model evaluation on the water quality prediction model by using the time series characteristic data after the division in the test set.
In step A54, the model evaluation indexes include the mean absolute error (MAE), the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the coefficient of determination (R²), calculated as follows.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
Wherein: $n$ is the number of sample data; $y_i$ is the actual value at the i-th moment, i.e. the actual value in the prediction target; $\hat{y}_i$ is the predicted value at the i-th moment, i.e. the predicted value of the water quality prediction model; and $\bar{y}$ is the mean of $y_i$.
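As an illustration, the four evaluation indexes can be computed with the standard definitions of MAE, RMSE, MAPE and R². The function name is hypothetical, and the sketch assumes no actual value is zero (MAPE is undefined otherwise) and that the actual values are not all identical (R² is undefined otherwise).

```python
import math
from typing import Dict, Sequence


def evaluate(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MAE, RMSE, MAPE (in percent) and R2 for paired observations."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumes every actual value is non-zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}
```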
Referring to figs. 9-10, the present invention further provides a method for predicting the water quality of a surface water body, which uses the trained water quality prediction model obtained by the construction method of the surface water quality prediction model described above and includes:
step B1, acquiring monitoring data and space position data related to a station to be monitored, wherein the monitoring data comprises water quality data and weather data;
step B2, preprocessing the monitoring data;
step B3, determining the prediction type of the station to be monitored based on the spatial position data of the station to be monitored, reconstructing the preprocessed monitoring data according to the prediction type of the station to be monitored, and then performing feature extraction on the reconstructed monitoring data to form time sequence feature data;
and step B4, using the time series feature data as the input of the trained water quality prediction model, which outputs the water quality prediction result of the station to be monitored.
The prediction type of the station to be monitored is one of single-station prediction, double-station prediction and multi-station prediction. Single-station prediction means that the station to be monitored has neither a corresponding upstream monitoring station nor a corresponding downstream monitoring station. Double-station prediction means that the station to be monitored has either an upstream monitoring station or a downstream monitoring station. Multi-station prediction means that the station to be monitored has both an upstream monitoring station and a downstream monitoring station.
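The three prediction types reduce to a simple decision on whether upstream and downstream stations exist. A minimal sketch follows; the enum and function names are hypothetical.

```python
from enum import Enum


class PredictionType(Enum):
    SINGLE = "single-station"
    DOUBLE = "double-station"
    MULTI = "multi-station"


def prediction_type(has_upstream: bool, has_downstream: bool) -> PredictionType:
    """Classify a station by the presence of neighbouring monitoring stations."""
    if has_upstream and has_downstream:
        return PredictionType.MULTI
    if has_upstream or has_downstream:
        return PredictionType.DOUBLE
    return PredictionType.SINGLE
```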
Specifically, the water quality data in step B1 includes, but is not limited to, pH, dissolved oxygen, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen, conductivity, turbidity, chlorophyll, algae density, and the like.
Specifically, the weather data includes, but is not limited to, temperature, rainfall, wind speed, barometric pressure, humidity, and the like.
Specifically, the spatial location data includes: the section name, river name and river basin name of the upstream monitoring station corresponding to the station to be monitored.
Further, in step B2, preprocessing the monitoring data includes: deleting abnormal data, supplementing missing data and normalizing. In addition, the preprocessing also includes removing noise data. The specific process is the same as the preprocessing of the monitoring data in step A2.
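The three named preprocessing operations could be sketched as below. The concrete techniques are assumptions chosen for illustration (a valid-range check for abnormal data, linear interpolation for missing data, min-max scaling for normalization); the passage does not say which techniques step A2 uses, and noise removal is omitted here. All function names are hypothetical.

```python
from typing import List, Optional, Sequence


def drop_abnormal(values: Sequence[Optional[float]],
                  low: float, high: float) -> List[Optional[float]]:
    """Mark readings outside [low, high] as missing (None)."""
    return [v if v is not None and low <= v <= high else None for v in values]


def fill_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Fill gaps by linear interpolation; edge gaps copy the nearest reading."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no valid readings")
    out: List[float] = []
    for i, v in enumerate(values):
        if v is not None:
            out.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out.append(values[right])
        elif right is None:
            out.append(values[left])
        else:
            w = (i - left) / (right - left)
            out.append(values[left] + w * (values[right] - values[left]))
    return out


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For a pH series, for instance, the valid range could be taken as 0 to 14 before interpolation and scaling.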
Referring to fig. 10, further, step B3 includes:
step B31, determining the prediction type of the station to be monitored based on the spatial position data of the station to be monitored, and reconstructing the preprocessed monitoring data according to the prediction type of the station to be monitored;
step B32, carrying out correlation analysis on the reconstructed monitoring data to obtain a characteristic correlation coefficient;
and step B33, retaining the features whose correlation coefficient is not smaller than a preset correlation threshold to form the time series feature data of the station to be monitored.
Further, in step B31,
if the prediction type of the station to be monitored is single-station prediction, only taking the monitoring data of the station to be monitored as reconstructed monitoring data;
if the prediction type of the station to be monitored is double-station prediction, taking the monitoring data of the upstream monitoring station or the downstream monitoring station corresponding to the station to be monitored and the monitoring data of the station to be monitored as reconstructed monitoring data;
if the prediction type of the station to be monitored is multi-station prediction, the monitoring data of the upstream monitoring station and the downstream monitoring station corresponding to the station to be monitored and the monitoring data of the station to be monitored are used as reconstructed monitoring data.
In step B31, the monitoring data of the upstream monitoring station and the downstream monitoring station are also the monitoring data after the preprocessing in step B2.
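The reconstruction in step B31 amounts to merging the target station's series with whichever neighbouring series exist. A minimal sketch, assuming each station's data is a mapping from indicator name to an equally long, time-aligned list; the function name and the `up_` / `down_` prefixes are hypothetical.

```python
from typing import Dict, List, Optional

StationData = Dict[str, List[float]]


def reconstruct(target: StationData,
                upstream: Optional[StationData] = None,
                downstream: Optional[StationData] = None) -> StationData:
    """Combine target-station data with available neighbour-station data.

    Neither neighbour: single-station; one neighbour: double-station;
    both neighbours: multi-station.
    """
    merged: StationData = {name: list(series) for name, series in target.items()}
    length = len(next(iter(target.values())))
    for prefix, data in (("up_", upstream), ("down_", downstream)):
        if data is None:
            continue
        for name, series in data.items():
            if len(series) != length:
                raise ValueError("series must be time-aligned with the target")
            merged[prefix + name] = list(series)
    return merged
```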
Further, the correlation analysis covers: the correlation among the water quality data, the correlation between the water quality data and the weather data, and the spatial correlation between the water quality data of the upstream and downstream monitoring stations. The specific analysis procedure is as described in the foregoing steps A32 and A33.
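Threshold-based feature retention can be sketched as follows. Two points are assumptions rather than statements from the text: the Pearson coefficient is used as the correlation measure, and its absolute value is compared with the threshold so that strongly negative correlations are also kept. The threshold value 0.3 and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(data: Dict[str, List[float]], target: str,
                    threshold: float = 0.3) -> Dict[str, List[float]]:
    """Keep the target and every feature with |r| not smaller than threshold."""
    return {
        name: series
        for name, series in data.items()
        if name == target or abs(pearson(series, data[target])) >= threshold
    }
```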
Referring to fig. 11, further, in step B4, the time series feature data is used as the input of the trained water quality prediction model, and the water quality prediction model outputs the water quality prediction result of the station to be monitored.
Step B4 includes:
step B41, dividing the time sequence characteristic data into a univariate time sequence and a multivariate time sequence;
and step B42, taking the univariate time sequence as the input of the first network model, taking the multivariate time sequence as the input of the second network model, and taking the output of the first network model and the output of the second network model as the water quality prediction result of the site to be monitored.
Further, after step B4, the method further includes:
and B5, analyzing a water quality prediction result of the site to be monitored, and generating early warning information when the water quality prediction result meets the early warning condition.
Referring to fig. 12, further, step B5 includes:
step B51, determining the water quality grade of a station to be monitored based on a water quality prediction result and a preset surface water quality standard;
step B52, judging whether the water quality prediction result meets the fixed early warning condition:
if yes, go to step B53;
if not, go to step B54;
step B53, generating and outputting first early warning information, and outputting a water quality prediction result and a water quality grade;
step B54, judging whether the water quality prediction result meets the dynamic early warning condition:
if yes, go to step B55;
if not, go to step B56;
step B55, generating and outputting second early warning information, and outputting a water quality prediction result and a water quality grade;
and step B56, outputting a water quality prediction result and a water quality grade.
Specifically, in step B51, the surface water quality standard is the Environmental Quality Standards for Surface Water (GB 3838-2002).
Wherein, the fixed early warning condition is: the water quality prediction result exceeds a preset fixed threshold. The water quality prediction result is compared with the fixed threshold, and if it is greater than the fixed threshold, first early warning information indicating that the limit has been exceeded is issued.
Wherein, the dynamic early warning condition is: the water quality prediction result exceeds a preset percentage of the lower limit value of the dynamic threshold corresponding to the season.
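The decision flow of steps B51 to B56 can be sketched as below. This is one literal reading of the two conditions, not a definitive implementation: the grade lookup assumes an indicator for which lower values are better, the class limits are supplied by the caller (none are quoted from GB 3838-2002 here), and the default percentage of 0.8 is taken from the 80% to 90% range given in the description. Function names are hypothetical.

```python
from typing import Optional, Sequence


def water_quality_grade(value: float, class_limits: Sequence[float]) -> int:
    """Return the 1-based class whose upper limit first covers the value.

    class_limits must be ascending; a value above all limits gets
    len(class_limits) + 1 (worse than the last class).
    """
    for grade, limit in enumerate(class_limits, start=1):
        if value <= limit:
            return grade
    return len(class_limits) + 1


def early_warning(prediction: float, fixed_threshold: float,
                  dynamic_lower_limit: float,
                  percentage: float = 0.8) -> Optional[str]:
    """Return "first", "second" or None following steps B52 to B56."""
    if prediction > fixed_threshold:                    # B52 -> B53
        return "first"
    if prediction > percentage * dynamic_lower_limit:   # B54 -> B55
        return "second"
    return None                                         # B56
```

The fixed check runs first, so a prediction that breaches both conditions yields only the first (highest-level) warning.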
Referring to fig. 13, further, in step B54, the process of forming the dynamic threshold includes:
step B541, acquiring water quality data of the station to be monitored over recent years;
step B542, dividing the water quality data of recent years by season to obtain the water quality data of each season;
step B543, respectively calculating the mean value and the variance of the water quality data in each season;
step B544, determining the dynamic threshold value of each season according to the mean value and the variance of the water quality data in each season.
Specifically, in step B541, water quality data of the station to be monitored over the last three years is obtained.
Specifically, in step B542, the water quality data of recent years is divided into spring water quality data, summer water quality data, autumn water quality data and winter water quality data.
Specifically, in step B543, the mean and variance of the spring water quality data, of the summer water quality data, of the autumn water quality data and of the winter water quality data are calculated respectively. The calculation formulas are as follows:
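$$\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i$$
$$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \bar{x}\right)^2$$
wherein $x_i$ represents the i-th water quality data value of the season, $\bar{x}$ and $\sigma^2$ are the mean and the variance of the water quality data of that season, and $m$ represents the number of water quality data of the season.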
Specifically, in step B544, the dynamic threshold of each season is obtained according to the Pauta criterion (3σ rule).
Wherein the upper limit of the dynamic threshold is the mean + 3 × variance;
wherein the lower limit of the dynamic threshold is the mean − 3 × variance.
If the water quality prediction result exceeds the preset percentage of the lower limit value of the dynamic threshold, second early warning information indicating that the result is approaching the lower limit value is issued. The preset percentage ranges from 80% to 90%.
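Steps B541 to B544 can be sketched as follows. Two points are assumptions: the month-to-season mapping (meteorological seasons of the northern hemisphere) is not given in the text, and the spread term is configurable. The default follows the wording above (mean ± 3 × variance), while `spread="std"` uses the standard deviation, which is how the 3σ rule is conventionally stated. Function and parameter names are hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pstdev, pvariance
from typing import Dict, Iterable, List, Tuple

# Assumed mapping: meteorological seasons, northern hemisphere.
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def seasonal_thresholds(records: Iterable[Tuple[int, float]],
                        spread: str = "variance"
                        ) -> Dict[str, Tuple[float, float]]:
    """Return {season: (lower_limit, upper_limit)} from (month, value) pairs."""
    if spread not in ("variance", "std"):
        raise ValueError("spread must be 'variance' or 'std'")
    by_season: Dict[str, List[float]] = defaultdict(list)
    for month, value in records:
        by_season[SEASON_BY_MONTH[month]].append(value)
    thresholds = {}
    for season, values in by_season.items():
        mean = fmean(values)
        # Population statistics (division by m, the number of readings).
        s = pvariance(values) if spread == "variance" else pstdev(values)
        thresholds[season] = (mean - 3 * s, mean + 3 * s)
    return thresholds
```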
Considering that water quality changes are affected by the seasons, and in order to avoid the unreasonable early warnings caused by the single fixed threshold of the conventional method, a fixed threshold is combined with a dynamic threshold. The fixed threshold is set according to national or local standards and has the highest early warning level; the dynamic threshold is set by season, its value cannot exceed the national or local standards, and its early warning level is secondary.
By means of the construction method and the prediction method of the surface water quality prediction model, the invention reduces modeling difficulty and modeling cost, depends little on basic data, achieves a good prediction effect, and realizes intelligent prediction based on historical data.
According to the method and device for constructing the surface water quality prediction model, a big data model, namely a convolutional and recurrent neural network model, is adopted to construct the surface water quality prediction model. Single-step and multi-step prediction of the future trend of the water body can be realized, the future development and trend of the surface water body can be predicted, and corresponding warning information is given according to the set thresholds, so that water pollution accidents are effectively prevented and technical support is provided for top-level design covering overall coordination, systematic protection of water resources and water quality safety of water plants. The method is also of significance for strengthening the environmental protection of water resources, improving water pollution control, reducing water pollution accidents and promoting ecological restoration.
The foregoing is merely illustrative of the preferred embodiments of the present invention and is not intended to limit the embodiments and scope of the present invention, and it should be appreciated by those skilled in the art that equivalent substitutions and obvious variations may be made using the description and illustrations of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. The construction method of the surface water quality prediction model is characterized by comprising the following steps:
step A1, acquiring monitoring data and spatial position data related to a plurality of monitoring stations, wherein the monitoring data comprise water quality data;
step A2, preprocessing the monitoring data;
step A3, determining the prediction type of the monitoring station based on the spatial position data, reconstructing the preprocessed monitoring data according to the prediction type of the monitoring station, and then performing feature extraction on the reconstructed monitoring data to form time series feature data;
step A4, generating sample data according to the time series feature data;
and step A5, training a pre-constructed water quality prediction model by using the sample data, wherein the water quality prediction model is used for predicting the surface water quality of the monitoring station.
2. The method for constructing a surface water quality prediction model according to claim 1, wherein the step A3 includes:
step A31, determining the prediction type of the monitoring station based on the spatial position data, and reconstructing the preprocessed monitoring data according to the prediction type of the monitoring station;
step A32, carrying out correlation analysis on the reconstructed monitoring data to obtain feature correlation coefficients;
and step A33, retaining the features whose correlation coefficient is not smaller than a preset correlation threshold to form the time series feature data.
3. The method for constructing a surface water quality prediction model according to claim 2, wherein the prediction types of the monitoring stations include single-station prediction, double-station prediction and multi-station prediction, and the step A31 includes:
if the prediction type of the monitoring station is single-station prediction, only using the monitoring data of the monitoring station as reconstructed monitoring data;
if the prediction type of the monitoring station is double-station prediction, combining the monitoring data of the upstream monitoring station or the downstream monitoring station corresponding to the monitoring station with the monitoring data of the monitoring station to obtain reconstructed monitoring data;
if the prediction type of the monitoring station is multi-station prediction, combining the monitoring data of the upstream monitoring station and the downstream monitoring station corresponding to the monitoring station and the monitoring data of the monitoring station to obtain reconstructed monitoring data.
4. The method for constructing a surface water quality prediction model according to claim 1, wherein the step A4 includes:
and step A41, selecting a series of input samples from the time series characteristic data by adopting a sliding window technology, taking the time series characteristic data with a preset time length after a sliding window corresponding to the input samples as a prediction target, and taking the input samples and the prediction target as sample data.
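The sliding window sampling of step A41 can be sketched as follows, assuming a one-dimensional series, a stride of one step, and a prediction target made of the `horizon` values that immediately follow each window. The function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def sliding_window_samples(series: Sequence[float], window: int, horizon: int
                           ) -> List[Tuple[List[float], List[float]]]:
    """Return (input window, prediction target) pairs with a stride of one."""
    if window <= 0 or horizon <= 0:
        raise ValueError("window and horizon must be positive")
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = list(series[start:start + window])
        y = list(series[start + window:start + window + horizon])
        samples.append((x, y))
    return samples
```

With `horizon=1` each target is the single next value (single-step prediction); a larger horizon yields multi-step targets.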
5. The method for constructing a surface water quality prediction model as claimed in claim 1, wherein the water quality prediction model comprises a first network model and a second network model;
in the step A5, the training process of the water quality prediction model includes:
step A51, dividing the time series characteristic data into a univariate time series and a multivariate time series;
step A52, the univariate time series is used as the input of a first network model, the multivariate time series is used as the input of a second network model, and the water quality prediction model is trained.
6. The method of claim 5, wherein the first network model is an EEMD-CNN-LSTM neural network model and the second network model is a CNN-LSTM neural network model.
7. A method for predicting the quality of a surface water body, characterized by using a water quality prediction model obtained by the method for constructing a surface water body water quality prediction model according to any one of claims 1 to 6, comprising:
step B1, acquiring monitoring data and space position data related to a station to be monitored, wherein the monitoring data comprises water quality data;
step B2, preprocessing the monitoring data;
step B3, determining the prediction type of the station to be monitored based on the spatial position data of the station to be monitored, reconstructing the preprocessed monitoring data according to the prediction type of the station to be monitored, and then performing feature extraction on the reconstructed monitoring data to form time sequence feature data;
and step B4, using the time series feature data as the input of the trained water quality prediction model, and outputting a water quality prediction result of the station to be monitored by the water quality prediction model.
8. The method for predicting the quality of a body of surface water of claim 7, further comprising:
and step B5, analyzing the water quality prediction result of the station to be monitored, and generating early warning information when the water quality prediction result meets the early warning condition.
9. The method for predicting the quality of a body of surface water of claim 8, wherein said step B5 comprises:
step B52, judging whether the water quality prediction result meets a fixed early warning condition or not:
if yes, go to step B53;
if not, go to step B54;
step B53, generating and outputting first early warning information and outputting the water quality prediction result;
step B54, judging whether the water quality prediction result meets a dynamic early warning condition:
if yes, go to step B55;
if not, go to step B56;
step B55, generating and outputting second early warning information and outputting the water quality prediction result;
and step B56, outputting the water quality prediction result.
10. The method for predicting the quality of a body of surface water of claim 9, wherein the fixed pre-warning conditions are: the water quality prediction result exceeds a preset fixed threshold value;
the dynamic early warning conditions include: the water quality prediction result exceeds a preset percentage of a lower limit value of a dynamic threshold corresponding to a season.
CN202310007638.7A 2023-01-04 2023-01-04 Construction method and prediction method of surface water quality prediction model Pending CN116128039A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310007638.7A CN116128039A (en) 2023-01-04 2023-01-04 Construction method and prediction method of surface water quality prediction model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310007638.7A CN116128039A (en) 2023-01-04 2023-01-04 Construction method and prediction method of surface water quality prediction model

Publications (1)

Publication Number Publication Date
CN116128039A true CN116128039A (en) 2023-05-16

Family

ID=86295083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310007638.7A Pending CN116128039A (en) 2023-01-04 2023-01-04 Construction method and prediction method of surface water quality prediction model

Country Status (1)

Country Link
CN (1) CN116128039A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117609792A (en) * 2024-01-18 2024-02-27 北京英视睿达科技股份有限公司 Water quality prediction model training method
CN117686447A (en) * 2024-01-31 2024-03-12 北京英视睿达科技股份有限公司 Water quality monitoring method, device, equipment and medium based on multichannel model
CN117609792B (en) * 2024-01-18 2024-06-11 北京英视睿达科技股份有限公司 Water quality prediction model training method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117609792A (en) * 2024-01-18 2024-02-27 北京英视睿达科技股份有限公司 Water quality prediction model training method
CN117609792B (en) * 2024-01-18 2024-06-11 北京英视睿达科技股份有限公司 Water quality prediction model training method
CN117686447A (en) * 2024-01-31 2024-03-12 北京英视睿达科技股份有限公司 Water quality monitoring method, device, equipment and medium based on multichannel model
CN117686447B (en) * 2024-01-31 2024-05-03 北京英视睿达科技股份有限公司 Water quality monitoring method, device, equipment and medium based on multichannel model

Similar Documents

Publication Publication Date Title
Jebli et al. Prediction of solar energy guided by pearson correlation using machine learning
Cheng et al. Forecasting of wastewater treatment plant key features using deep learning-based models: A case study
CN111291937A (en) Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
Xu et al. Mid-term prediction of electrical energy consumption for crude oil pipelines using a hybrid algorithm of support vector machine and genetic algorithm
Zheng et al. Prediction of harmful algal blooms in large water bodies using the combined EFDC and LSTM models
Li et al. Concentration estimation of dissolved oxygen in Pearl River Basin using input variable selection and machine learning techniques
CN112132333B (en) Short-term water quality and quantity prediction method and system based on deep learning
CN110516844A (en) Multivariable based on EMD-PCA-LSTM inputs photovoltaic power forecasting method
CN111210128B (en) Wetland early warning method based on artificial intelligence and random self-adaptive threshold
Zhang et al. Turbidity prediction of lake-type raw water using random forest model based on meteorological data: A case study of Tai lake, China
CN114358213B (en) Error ablation processing method, system and medium for nonlinear time series data prediction
CN113496314B (en) Method for predicting road traffic flow by neural network model
CN111652425A (en) River water quality prediction method based on rough set and long and short term memory network
CN116128039A (en) Construction method and prediction method of surface water quality prediction model
CN114023399A (en) Air particulate matter analysis early warning method and device based on artificial intelligence
CN115222106A (en) User day-ahead load prediction method of self-adaptive model
Cui et al. Deep learning methods for atmospheric PM2. 5 prediction: A comparative study of transformer and CNN-LSTM-attention
KR20230086850A (en) Data­Driven Hybrid Model for Forecasting Wastewater Infuent Loads Based on Multimodal and Ensemble Deep Learning
Jiang et al. Prediction of sea temperature using temporal convolutional network and LSTM-GRU network
Xun et al. Photovoltaic power forecasting method based on adaptive classification strategy and HO-SVR algorithm
CN116151464A (en) Photovoltaic power generation power prediction method, system and storable medium
CN112070303B (en) Parameter-adaptive photovoltaic power ramp event hierarchical probabilistic prediction method
Kang et al. Research on forecasting method for effluent ammonia nitrogen concentration based on GRA-TCN
Wu et al. Combined IXGBoost-KELM short-term photovoltaic power prediction model based on multidimensional similar day clustering and dual decomposition
Zhang et al. A Multi-Model Prediction Method for Coal Mine Gas Concentration with Hierarchical Structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination