CN113408799A - River total nitrogen concentration prediction method based on hybrid neural network - Google Patents

River total nitrogen concentration prediction method based on hybrid neural network

Info

Publication number
CN113408799A
Authority
CN
China
Prior art keywords
water quality
river
layer
predicted
total nitrogen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110669073.XA
Other languages
Chinese (zh)
Inventor
闫健卓
刘佳雪
于涌川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN202110669073.XA
Publication of CN113408799A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a river total nitrogen concentration prediction method based on a hybrid neural network. First, the water quality data of the river to be predicted are cleaned using an isolation forest and Lagrange interpolation. Second, nonlinear local features are extracted from the cleaned water quality data using a one-dimensional residual convolutional neural network. A bidirectional gated recurrent unit then integrates information from earlier and later points in the time series, and a fully connected layer at the top of the model produces the final river total nitrogen concentration prediction. Finally, evaluation indexes are used to assess the model. According to an embodiment, a set of experiments compares the proposed method with traditional, single neural network water quality parameter prediction methods. The experimental results show that the water quality parameter prediction model performs well in terms of stability and generalization capability and effectively reduces prediction errors. The method integrates a feature extraction module with a bidirectional recurrent prediction module for the first time and applies the combination to the field of water quality prediction.

Description

River total nitrogen concentration prediction method based on hybrid neural network
Technical Field
The invention belongs to the field of computer science and relates to a river total nitrogen concentration prediction method based on a hybrid neural network.
Background Art
With the rapid development of China's economy and the rapid progress of science and technology, the range of human production and living activities keeps expanding. Domestic sewage, industrial wastewater from the chemical fertilizer, food and other industries, and farmland drainage contain large amounts of nitrogen, phosphorus and other inorganic salts, which greatly increase the nutrient load of rivers and make eutrophication of the river water environment likely. The essence of eutrophication is an imbalance between the input and output of nutrient salts, which unbalances the distribution of species in the aquatic ecosystem: a single species proliferates excessively, the flow of matter and energy in the system is disrupted, and the whole ecosystem gradually declines toward collapse. Deteriorating river water quality has profound effects on the ecological health of surface waters and their tributaries and increases the burden of sustainably supplying drinking water. In water environment management, real-time water quality prediction provides a scientific basis for protection and treatment, and building an accurate and effective water quality parameter prediction model is a crucial step in improving river water quality.
Data-driven models are widely used for water quality parameter prediction. The main approaches are time series methods, grey theory prediction, regression prediction (such as support vector machines) and artificial neural network prediction. However, the first three suffer from poor generalization ability and low prediction accuracy. In recent years, deep learning methods have received increasing attention in water quality modeling. An artificial neural network is a machine learning technique that simulates a biological nervous system through a large number of adaptive simple units interconnected in parallel. It is the basis of deep learning and offers good robustness and the ability to fit complex nonlinear relationships. The embodiment of the invention therefore provides a river total nitrogen concentration prediction method based on a hybrid neural network, aiming to improve the stability and generalization capability of the model and to reduce the prediction error of the total nitrogen concentration.
Disclosure of Invention
The invention aims to provide a river total nitrogen concentration prediction method based on a hybrid neural network for predicting the water quality parameter total nitrogen. Existing, traditional single water quality prediction algorithms cannot mine the local features of water quality data, which limits prediction accuracy and efficiency. To solve this problem, the overall network model consists of two main parts: a feature learning module and a prediction generation module. A one-dimensional residual convolutional neural network (1-DRCNN) extracts the local features of the water quality data, and a bidirectional gated recurrent unit (BiGRU), serving as the prediction module, integrates information from earlier and later points in the time series to obtain the final prediction of the water quality parameter. To improve the integrity of the data, the river water quality data are cleaned and corrected before model training using an isolation forest and Lagrange interpolation. The data set used for model training and testing is a real data set from the Luan River. The method fuses the two neural networks and provides a new deep learning water quality prediction method that improves the prediction accuracy of water quality parameters.
To solve the above problems, the invention adopts the following technical solution:
A river total nitrogen concentration prediction method based on a hybrid neural network mainly comprises the following steps:
Step 1: water quality data of the Luan River section are collected once every four hours by the acquisition module of Internet of Things equipment, and the original water quality data set of the area to be predicted is cleaned.
Step 2: the river water quality data of the area to be predicted are divided into a training set and a test set, and a model is constructed. The model consists of two main modules: a feature learning module and a prediction generation module. The feature learning module uses a 1-DRCNN, and the prediction generation module uses a BiGRU and a fully connected layer.
Step 3: the model constructed in step 2 is trained using the Adam optimization algorithm. After training, the optimal water quality parameter prediction model is obtained and used to predict the total nitrogen concentration of the river on the test set.
Step 4: the performance of the model is evaluated using the mean absolute error, mean absolute percentage error, root mean square error and coefficient of determination.
In the river total nitrogen concentration prediction method based on the hybrid neural network, the water quality data comprise 9 water quality parameters: temperature (T, °C), pH, dissolved oxygen (DO, mg/L), biochemical oxygen demand (BOD, mg/L), turbidity (NTU), permanganate index (COD-Mn, mg/L), ammonia nitrogen (NH3-N, mg/L), total phosphorus (TP, mg/L) and total nitrogen (TN, mg/L).
Preferably, step 1 specifically comprises the following:
Data cleaning is performed on the original water quality data set. It comprises detecting abnormal data in the original water quality data set with an isolation forest and filling vacancy values in the original water quality data by Lagrange interpolation.
Preferably, step 2 specifically comprises the following steps:
Step 2.1: the water quality data set of the river area to be predicted is randomly divided into a training set and a test set.
Step 2.2: the training set is input into the 1-DRCNN network as the current water quality parameters. The 1-DRCNN mines and extracts the latent nonlinear relationships among the current water quality parameters to form effective low-dimensional features.
Step 2.3: the 1-DRCNN output is used as the input of the BiGRU model, which learns and integrates the dependence of the water quality parameters on earlier and later time steps and further optimizes the feature representation of the water quality data.
Step 2.4: at the top of the model, the output of the BiGRU is used as the input of a fully connected layer, which generates the predicted value of the water quality parameter.
Preferably, step 3 specifically comprises the following steps:
Step 3.1: during model training, the Adam optimizer adjusts the weights and biases of the model, and the deviation between the river total nitrogen concentration output by the fully connected layer and the true total nitrogen concentration in the training set is computed continuously. When the deviation falls below a user-defined threshold, training ends and the optimal model is obtained.
Step 3.2: the trained optimal model is used to obtain the total nitrogen concentration of the river to be predicted on the test set.
Compared with the prior art, the invention has the following advantages:
The invention not only predicts the river total nitrogen concentration but also cleans the water quality data of the river before prediction, which improves the integrity of the data. The 1-DRCNN addresses a known problem of traditional deep CNNs: as the network deepens during training, accuracy saturates and then drops rapidly, leading to gradient explosion and degradation. The invention makes full use of the feature extraction and representation capability of the 1-DRCNN to learn the complex, nonlinear local features of the river water quality data and to reduce their dimensionality. In addition, the BiGRU in the prediction generation module enables the model to capture the contextual time dependence of the river water quality data during training, which effectively improves the accuracy of the predicted river total nitrogen concentration. Overall, the method predicts the total nitrogen concentration of the river efficiently and accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic structural diagram of the 1-DRCNN-BiGRU model in a river total nitrogen concentration prediction method based on a hybrid neural network according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the BiGRU model in a river total nitrogen concentration prediction method based on a hybrid neural network according to an embodiment of the present invention;
Fig. 3 is a complete algorithm flowchart of a river total nitrogen concentration prediction method based on a hybrid neural network according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The method for predicting the total nitrogen concentration in river water according to the embodiment of the invention is described below with reference to fig. 1, and specifically includes the following steps:
Step 1: water quality data of the Luan River section are collected once every four hours by the data acquisition module of Internet of Things equipment. The water quality parameters collected at each moment form a vector, which is used as one sample, and the original water quality data set of the area to be predicted is cleaned.
Because of acquisition equipment failures, manual recording errors and similar factors, abnormal values and vacancy values inevitably occur in the water quality data, and such non-conforming data lower the prediction accuracy of the model. This embodiment therefore improves the integrity of the river water quality data to be predicted by cleaning the original water quality data set. The data cleaning comprises detecting abnormal data in the original water quality data set with an isolation forest and filling vacancy values in the original water quality data by Lagrange interpolation.
The embodiment uses the isolation forest to detect abnormal values in the river water quality data collected by the equipment. Abnormal value detection with the isolation forest is divided into two main steps:
First, an isolation forest is constructed. The water quality data set to be predicted is recursively partitioned, without considering the distance or density between samples, to build isolation trees (iTrees) until all sample points are isolated.
Second, an anomaly score is calculated by the following formulas:
$$s(x, n) = 2^{-\frac{E(h(x))}{C(n)}}$$
$$C(n) = 2H(n-1) - \frac{2(n-1)}{n}$$
where n is the number of samples in the water quality data to be predicted, and H(n-1) is the harmonic number, which can be estimated by ln(n-1) + 0.5772156649. C(n) is the average path length of a binary tree built from n water quality samples and is mainly used for normalization. E(h(x)) is the mean path length of the sample x over the iTrees. The closer the score is to 1, the more likely the sample is abnormal; the closer the score is to 0, the more likely the sample is normal.
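As an illustrative sketch only (not part of the patent text), the anomaly score described above can be computed as follows. The function names are hypothetical, the harmonic number uses the estimate quoted in the text, and the exact value C(2) = 1 is used for n = 2 because the logarithmic estimate is poor for very small n.

```python
import math

EULER_GAMMA = 0.5772156649  # constant quoted in the text


def harmonic_estimate(i):
    """Estimate the harmonic number H(i) as ln(i) + Euler's constant."""
    return math.log(i) + EULER_GAMMA


def average_path_length(n):
    """C(n): average path length of a binary tree built from n samples."""
    if n < 2:
        raise ValueError("C(n) needs at least two samples")
    if n == 2:
        return 1.0  # exact value, since H(1) = 1
    return 2.0 * harmonic_estimate(n - 1) - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length, n):
    """s(x, n) = 2 ** (-E(h(x)) / C(n)); values near 1 suggest an anomaly."""
    return 2.0 ** (-mean_path_length / average_path_length(n))


if __name__ == "__main__":
    n = 256  # hypothetical sub-sample size
    for mean_depth in (3.0, 12.0):
        print(mean_depth, round(anomaly_score(mean_depth, n), 3))
```

A sample that is isolated after only a few splits (short mean path length) receives a higher score than one that needs many splits, which matches the interpretation given in the text.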
On the basis of the above embodiment, after abnormal value detection, the detected abnormal values in the water quality data to be predicted are set to null. The vacancies are then filled by Lagrange interpolation. From the non-vacant sample points $(x_0, y_0), \dots, (x_m, y_m)$ of the water quality data to be predicted, a polynomial function is found that passes exactly through these points in the two-dimensional plane, so that it takes the observed value at each observation point. The vacancy values are filled according to this polynomial function. The polynomial function of the Lagrange interpolation method is as follows:
$$L_m(x) = y_0 l_0(x) + y_1 l_1(x) + \dots + y_m l_m(x)$$
where $y_i$ is the true value of the $i$-th sample point in the river water quality data set to be predicted, $x_i$ is its observation point, and $l_i(x)$ is the interpolation basis function corresponding to that sample point:
$$l_i(x) = \prod_{j=0,\, j \neq i}^{m} \frac{x - x_j}{x_i - x_j}$$
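The Lagrange filling step can be sketched as below. This is a minimal illustration under stated assumptions, not the patent's implementation: the helper names and the `window` parameter (how many valid neighbours on each side are used to build the polynomial) are hypothetical, and the sample index is used as the observation point x.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)  # factor of l_i(x)
        total += yi * basis
    return total


def fill_gaps(series, window=3):
    """Fill None entries from up to `window` valid neighbours on each side."""
    filled = list(series)
    for t, value in enumerate(series):
        if value is not None:
            continue
        before = [k for k in range(t - 1, -1, -1) if series[k] is not None][:window]
        after = [k for k in range(t + 1, len(series)) if series[k] is not None][:window]
        idx = sorted(before + after)
        if len(idx) >= 2:  # need at least two known points to interpolate
            filled[t] = lagrange_interpolate(idx, [series[k] for k in idx], t)
    return filled


if __name__ == "__main__":
    # Hypothetical readings with one vacancy (an outlier already set to None).
    print(fill_gaps([1.0, 4.0, None, 16.0, 25.0]))
```

Restricting the polynomial to a small neighbourhood keeps its degree low; high-degree Lagrange polynomials fitted to a long series can oscillate strongly between the known points.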
Step 2: on the basis of the above embodiment, the river water quality data of the area to be predicted are divided into a training set and a test set. As shown in the overall model structure in Fig. 1, the river total nitrogen concentration prediction model consists of two main modules: a feature learning module and a prediction generation module. The feature learning module uses a 1-DRCNN, and the prediction generation module uses a BiGRU and a fully connected layer.
On the basis of the above embodiment, the water quality parameters collected at each moment form a vector that serves as an input sample of the model. Each input vector consists of eight water quality parameters at that moment: temperature (T), pH, dissolved oxygen (DO), biochemical oxygen demand (BOD), turbidity (NTU), permanganate index (COD-Mn), ammonia nitrogen (NH3-N) and total phosphorus (TP). Total nitrogen (TN) is the output variable predicted by the model; it is compared with the true total nitrogen concentration to evaluate the prediction accuracy of the model.
Step 2.1: the water quality data set of the river area to be predicted is randomly divided into a training set and a test set.
Step 2.2: the feature learning module, a 1-DRCNN network, is constructed, and the training set is input into it as the current water quality parameters. The 1-DRCNN mines and extracts the latent nonlinear relationships among the current water quality parameters to form effective low-dimensional features. The invention uses two one-dimensional convolutional residual blocks (1-DConv_block) as the main structure of the feature learning module. Each residual block mainly consists of three 1×1 convolutional layers, three batch normalization (BN) layers and three SELU activation functions, and the filter and convolution kernel sizes of each convolutional layer are set to 32.
On the basis of the above embodiment, a batch normalization (BN) layer is applied before each 1×1 convolutional layer to transform and reconstruct the output of the previous layer. BN computes the first-order and second-order statistics (mean and variance) of each batch and continuously adjusts the intermediate outputs of the one-dimensional convolutional neural network, so that higher layers do not have to keep re-adapting to parameter updates in lower layers. The output of each layer therefore tends to be stable, which alleviates the slowdown in convergence caused by vanishing gradients in the higher layers. For a sample $a = (a^{(1)}, \dots, a^{(d)})$ from the above embodiment, the normalization formula is:
$$\hat{a}^{(k)} = \frac{a^{(k)} - E[a^{(k)}]}{\sqrt{\mathrm{Var}[a^{(k)}]}}$$
$\hat{a}^{(k)}$ is then reconstructed:
$$y^{(k)} = \gamma^{(k)} \hat{a}^{(k)} + \beta^{(k)}$$
where $\gamma^{(k)} = \sqrt{\mathrm{Var}[a^{(k)}]}$ is derived from the variance of the sample and $\beta^{(k)} = E[a^{(k)}]$ is the expectation of the sample.
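The batch normalization step can be illustrated with the following sketch. It is written in plain Python for readability, the function name is hypothetical, and the small constant `eps` is an assumption added for numerical stability; it does not appear in the text.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then rescale and shift.

    batch: list of samples, each a list of d feature values.
    gamma, beta: per-feature scale and shift (length d).
    """
    n = len(batch)
    d = len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for k in range(d):
        column = [row[k] for row in batch]
        mean = sum(column) / n                          # first-order statistic
        var = sum((v - mean) ** 2 for v in column) / n  # second-order statistic
        for r in range(n):
            a_hat = (batch[r][k] - mean) / math.sqrt(var + eps)
            out[r][k] = gamma[k] * a_hat + beta[k]
    return out


if __name__ == "__main__":
    # Two hypothetical samples with two features on very different scales.
    print(batch_norm([[1.0, 10.0], [3.0, 30.0]], gamma=[1.0, 1.0], beta=[0.0, 0.0]))
```

With gamma set to the batch standard deviation and beta to the batch mean, the transform approximately recovers the original activations, which is the reconstruction property the text refers to.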
In addition, each 1×1 convolutional layer applies convolution kernels with shared weights to the different regions of the reconstructed river water quality training data of the above embodiment, learns information from all positions in the training set, and generates a low-dimensional feature vector. The output $y_i^l$ of each convolutional layer is:
$$x_i^l = b_i^l + \sum_{k} \mathrm{conv1D}\left(w_{ki}^{l-1},\, y_k^{l-1}\right)$$
$$y_i^l = f\left(x_i^l\right)$$
$$f(x) = \mathrm{SELU}(x) = \beta \begin{cases} x, & x > 0 \\ \alpha e^{x} - \alpha, & x \le 0 \end{cases}$$
In these formulas, $x_i^l$ is the input to the $i$-th neuron in layer $l$, $y_k^{l-1}$ is the output of the $k$-th neuron in layer $l-1$, $w_{ki}^{l-1}$ is the filter between the $k$-th neuron of layer $l-1$ and the $i$-th neuron of layer $l$, and $b_i^l$ is the bias of the $i$-th neuron in layer $l$. To improve the feature representation capability of the convolutional layer, its nonlinear feature mapping is realized by the activation function $f(\cdot)$. In the formula, $\alpha$ and $\beta$ are fixed values: $\beta = 1.05070098$ and $\alpha = 1.67326324$.
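To make the convolution and activation concrete, the sketch below implements SELU with the constants given in the text, a 1×1 convolution whose weights are shared across all time steps, and a simplified residual connection. It is an assumption-laden illustration: the function names are hypothetical, and the residual block here has a single convolution plus a skip connection rather than the three convolutional and three BN layers described for each 1-DConv_block.

```python
import math

ALPHA = 1.67326324
BETA = 1.05070098


def selu(x):
    """SELU activation: beta * x for x > 0, beta * alpha * (e^x - 1) otherwise."""
    return BETA * x if x > 0 else BETA * ALPHA * (math.exp(x) - 1.0)


def conv1x1(inputs, weights, biases):
    """1x1 convolution over a sequence, followed by SELU.

    inputs: list of time steps, each a list of C_in channel values.
    weights[o][c]: weight from input channel c to output channel o,
    shared by every time step. biases[o]: bias of output channel o.
    """
    return [
        [selu(sum(w * v for w, v in zip(weights[o], step)) + biases[o])
         for o in range(len(weights))]
        for step in inputs
    ]


def residual_block(inputs, weights, biases):
    """Add the block input to the convolution output (needs C_out == C_in)."""
    conv_out = conv1x1(inputs, weights, biases)
    return [[c + s for c, s in zip(conv_step, in_step)]
            for conv_step, in_step in zip(conv_out, inputs)]


if __name__ == "__main__":
    sequence = [[1.0, 2.0], [0.5, -1.0]]   # hypothetical two-channel series
    weights = [[1.0, 0.0], [0.0, -1.0]]    # hypothetical kernel
    print(residual_block(sequence, weights, [0.0, 0.0]))
```

The skip connection lets the block fall back to the identity mapping when the convolution contributes nothing, which is the usual argument for why residual structures ease the training of deeper networks.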
Step 2.3: the 1-DRCNN output is used as the input of the BiGRU model. The BiGRU network autonomously captures the dependencies among short-term sequence data, long-term data and context attributes, obtains a hidden state that combines the forward and backward directions, learns and integrates the dependence of the water quality parameters on earlier and later time steps, and further optimizes the feature representation of the water quality data. Its structure is shown in Fig. 2, and the BiGRU-based feature modeling process is described as follows:
$$\overrightarrow{h_t} = \mathrm{GRU}\left(x_t, \overrightarrow{h_{t-1}}\right)$$
$$\overleftarrow{h_t} = \mathrm{GRU}\left(x_t, \overleftarrow{h_{t+1}}\right)$$
$$h_t = w_t \overrightarrow{h_t} + v_t \overleftarrow{h_t} + b_t$$
where the $\mathrm{GRU}(\cdot)$ function denotes the nonlinear transformation by which the GRU network encodes the input vector of the above embodiment into the corresponding GRU hidden state. $w_t$ and $v_t$ are the weights of the forward hidden state $\overrightarrow{h_t}$ and the backward hidden state $\overleftarrow{h_t}$ of the BiGRU at time $t$, and $b_t$ is the bias of the hidden state at time $t$.
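The bidirectional combination described above can be sketched with a scalar GRU. Everything below is a simplified assumption for illustration: the input and hidden state are single numbers, the gate parameters in the dictionaries are hypothetical, and the combination weights w, v and bias b are constants rather than time-indexed values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(x, h, p):
    """One GRU update for a scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])             # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])             # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * cand


def bigru(sequence, fwd, bwd, w=0.5, v=0.5, b=0.0):
    """Combine forward and backward GRU states: h_t = w*fwd_t + v*bwd_t + b."""
    n = len(sequence)
    forward, h = [], 0.0
    for x in sequence:                      # read the series from past to future
        h = gru_step(x, h, fwd)
        forward.append(h)
    backward, h = [0.0] * n, 0.0
    for t in range(n - 1, -1, -1):          # read the series from future to past
        h = gru_step(sequence[t], h, bwd)
        backward[t] = h
    return [w * f + v * g + b for f, g in zip(forward, backward)]


if __name__ == "__main__":
    params = {"wz": 0.5, "uz": 0.1, "bz": 0.0,
              "wr": 0.4, "ur": 0.2, "br": 0.0,
              "wh": 0.9, "uh": 0.3, "bh": 0.0}   # hypothetical gate parameters
    print(bigru([0.1, 0.5, 0.1], params, params))
```

Each output therefore depends on the whole series: the forward state summarizes everything up to time t and the backward state summarizes everything from time t onward.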
Step 2.4: at the top of the model, the output of the BiGRU is used as the input of a fully connected layer, which generates the predicted total nitrogen concentration in this embodiment.
Step 3: the model constructed in step 2 is trained using the Adam optimization algorithm. After training, the optimal water quality parameter prediction model is obtained and then used to predict the total nitrogen concentration on the test set.
Step 3.1: during model training, the Adam optimizer adjusts the weights and biases of the model, and the error between the river total nitrogen concentration output by the fully connected layer and the true concentration is calculated. When the error falls below a preset threshold, training ends and the optimal model is obtained.
Step 3.2: the trained optimal model is used to predict the total nitrogen concentration on the test set, giving the total nitrogen concentration prediction result.
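The training rule in step 3.1, Adam updates that stop once the error drops below a preset threshold, can be sketched as follows. A one-variable linear model stands in for the 1-DRCNN-BiGRU network purely for illustration; the function name, learning rate, threshold and step limit are all hypothetical choices.

```python
import math


def train_adam(xs, ys, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8,
               threshold=1e-4, max_steps=5000):
    """Fit y = w*x + b with Adam; stop once the mean squared error < threshold."""
    params = [0.0, 0.0]   # [w, b]
    m = [0.0, 0.0]        # first-moment estimates
    v = [0.0, 0.0]        # second-moment estimates
    n = len(xs)
    loss = float("inf")
    for step in range(1, max_steps + 1):
        errors = [params[0] * x + params[1] - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        if loss < threshold:          # stopping rule from step 3.1
            break
        grads = [2.0 * sum(e * x for e, x in zip(errors, xs)) / n,
                 2.0 * sum(errors) / n]
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** step)   # bias-corrected moments
            v_hat = v[i] / (1 - beta2 ** step)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params, loss


if __name__ == "__main__":
    # Hypothetical data generated from y = 2x + 1.
    (w, b), final_loss = train_adam([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
    print(w, b, final_loss)
```

In practice the threshold is usually checked on a held-out validation set rather than on the training set, so that training does not stop on a model that has merely memorized the training data.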
Step 4: after the predicted river total nitrogen concentration is obtained, the accuracy and effectiveness of the model are evaluated using the mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination ($R^2$). The evaluation formulas are as follows:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - Y_i \right|$$
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - Y_i}{Y_i} \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - Y_i \right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( Y_i - y_i \right)^2}{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2}$$
where $N$ is the size of the sample set, $y_i$ is the model's predicted value of the $i$-th total nitrogen concentration, $Y_i$ is the observed (true) value of the $i$-th total nitrogen concentration in the test set, and $\bar{Y}$ is the mean of the observed total nitrogen concentrations in the test set. The closer MAE, MAPE and RMSE are to 0 and the closer $R^2$ is to 1, the higher the prediction accuracy of the constructed model.
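The four evaluation indexes can be computed as in the sketch below. The function name and the sample values are hypothetical; the definitions are the standard ones for MAE, MAPE, RMSE and the coefficient of determination.

```python
import math


def evaluate(predicted, observed):
    """Return MAE, MAPE (in percent), RMSE and R^2 for paired value lists."""
    n = len(observed)
    pairs = list(zip(predicted, observed))
    mae = sum(abs(p - o) for p, o in pairs) / n
    mape = 100.0 * sum(abs((p - o) / o) for p, o in pairs) / n
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in pairs) / n)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for p, o in pairs)      # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Hypothetical predicted and observed TN concentrations in mg/L.
    print(evaluate([2.0, 4.0], [1.0, 5.0]))
```

MAPE is undefined when an observed value is zero, and $R^2$ is undefined when all observed values are identical, so both cases need to be excluded or handled separately on real data.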
As previously mentioned, the advantages of the present invention are:
1. The invention solves the problem that a single water quality prediction algorithm cannot both mine the local features of water quality data and learn the time dependence of time series data, and it improves the accuracy and efficiency of river total nitrogen concentration prediction.
2. The model integrates a feature extraction module with a bidirectional recurrent prediction module for the first time and applies the combination to the field of water quality prediction, which effectively improves the stability and generalization capability of the water quality parameter prediction model.
The above embodiments are only exemplary embodiments of the present invention, and are not intended to limit the present invention, and the scope of the present invention is defined by the claims. Various modifications and equivalents may be made by those skilled in the art within the spirit and scope of the present invention, and such modifications and equivalents should also be considered as falling within the scope of the present invention.

Claims (6)

1. A river total nitrogen concentration prediction method based on a hybrid neural network, characterized by comprising the following steps:
step 1: collecting water quality data of the Luan River section once every four hours through the acquisition module of Internet of Things equipment, and cleaning the original water quality data set of the river area to be predicted;
step 2: dividing the river water quality data of the area to be predicted into a training set and a test set; constructing a model, wherein the model mainly comprises two modules: a feature learning module and a prediction generation module; the feature learning module uses a 1-DRCNN, and the prediction generation module uses a BiGRU and a fully connected layer;
step 3: training the model constructed in step 2 using the Adam optimization algorithm, obtaining the optimal water quality parameter prediction model after training, and predicting the total nitrogen concentration of the river to be predicted on the test set;
step 4: evaluating the performance of the model using model evaluation indexes comprising the mean absolute error, mean absolute percentage error, root mean square error and coefficient of determination.
2. The method for predicting the total nitrogen concentration of a river based on a hybrid neural network as claimed in claim 1, wherein: the river water quality data to be predicted that are cleaned in step 1 comprise 9 water quality parameters: temperature (T, °C), pH, dissolved oxygen (DO, mg/L), biochemical oxygen demand (BOD, mg/L), turbidity (NTU), permanganate index (COD-Mn, mg/L), ammonia nitrogen (NH3-N, mg/L), total phosphorus (TP, mg/L) and total nitrogen (TN, mg/L).
3. The method for predicting the total nitrogen concentration of the river based on the hybrid neural network as claimed in claim 1, wherein: the data cleaning method in step 1 detects abnormal values in the river water quality data to be predicted and fills missing values;
wherein abnormal value detection on the water quality data to be predicted using an isolation forest is mainly divided into two steps:
first, constructing the isolation forest: the water quality data set to be predicted is recursively partitioned, without considering the distance or density between samples, and isolation trees (iTrees) are constructed until all sample points are isolated;
second, calculating the anomaly score, with the formulas:
$$s(x, n) = 2^{-\frac{E(h(x))}{c(n)}}$$
$$c(n) = 2H(n-1) - \frac{2(n-1)}{n}$$
wherein n represents the number of water quality samples to be predicted; H(n-1) represents the harmonic number, which is estimated by ln(n-1) + 0.5772156649; c(n) represents the average path length of a binary tree constructed from n water quality samples and is mainly used for normalization; E(h(x)) represents the mean path length of the sample x over the iTrees; the closer the score is to 1, the more likely the sample is abnormal; the closer the score is to 0, the more likely the sample is normal;
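As a minimal sketch of the scoring step only, the following Python computes c(n) and the anomaly score from a mean path length that is assumed to have been obtained already from a set of iTrees. The function names are hypothetical and tree construction is not shown.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler's constant as quoted in the claim


def average_path_length(n):
    """c(n) = 2*H(n-1) - 2*(n-1)/n, with H(k) estimated as ln(k) + gamma."""
    if n < 2:
        raise ValueError("c(n) needs at least two samples")
    harmonic = math.log(n - 1) + EULER_GAMMA
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length, n):
    """s(x, n) = 2 ** (-E(h(x)) / c(n)); values near 1 suggest an outlier."""
    return 2.0 ** (-mean_path_length / average_path_length(n))
```

A sample whose mean path length equals c(n) scores exactly 0.5, and shorter paths (samples that are isolated quickly) push the score towards 1.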
after abnormal value detection is carried out on the water quality data, the detected abnormal values are set to null values and are then filled using the Lagrange interpolation method; from the m non-missing sample points of the water quality data set to be predicted, a polynomial function is found that passes exactly through these known points in the two-dimensional plane, that is, it takes the observed value at each observation point; the missing values are filled according to this polynomial function; the polynomial function of the Lagrange interpolation method is as follows:
$$L_m(x) = y_0 l_0(x) + y_1 l_1(x) + \dots + y_m l_m(x)$$
wherein $y_i$ represents the true value of the corresponding sample point in the river water quality data set to be predicted, and $l_i(x)$ represents the interpolation basis function corresponding to that sample point:
$$l_i(x) = \prod_{j=0,\, j \neq i}^{m} \frac{x - x_j}{x_i - x_j}$$
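The filling step can be sketched as follows. This is an illustration under assumptions rather than the claimed implementation: missing values are represented by `None`, the sample index is used as x, and the `window` parameter (how many known neighbours on each side feed the polynomial) is a hypothetical choice that the claim does not specify.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate L(x) = sum_i y_i * l_i(x) through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def fill_missing(series, window=3):
    """Fill None entries from up to `window` known neighbours on each side."""
    known = [(k, v) for k, v in enumerate(series) if v is not None]
    filled = list(series)
    for t, value in enumerate(series):
        if value is not None:
            continue
        before = [p for p in known if p[0] < t][-window:]
        after = [p for p in known if p[0] > t][:window]
        points = before + after
        if points:  # leave the gap untouched if nothing is known at all
            xs = [p[0] for p in points]
            ys = [p[1] for p in points]
            filled[t] = lagrange_interpolate(xs, ys, t)
    return filled
```

Restricting the polynomial to a few neighbours keeps its degree low; a high-degree polynomial through every known point tends to oscillate between samples.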
4. The method for predicting the total nitrogen concentration of the river based on the hybrid neural network as claimed in claim 1, wherein: the specific steps of constructing the prediction model of the river total nitrogen concentration in step 2 comprise:
step 2.1, randomly dividing the water quality data set of the river area to be predicted into a training set and a test set;
step 2.2, constructing the feature learning module, a 1-DRCNN network, and inputting the training set into the 1-DRCNN network as the current water quality parameters; the 1-DRCNN mines and extracts the latent nonlinear relationships among the current water quality parameters to form effective low-dimensional features; the method uses two one-dimensional convolution residual blocks (1-DConv_block) as the main structure of the feature learning module, wherein each residual block mainly comprises 3 convolutional layers of 1x1, 3 batch normalization layers (BN layers) and 3 activation functions (SELU), and the filter and convolution kernel sizes of each convolutional layer are set to 32;
before the input of each 1x1 convolutional layer, a batch normalization layer (BN layer) is used to transform and reconstruct the output of the previous layer; BN computes the first-order and second-order statistics of each batch and continuously adjusts the intermediate outputs of the successive layers of the one-dimensional convolutional neural network, so that the higher layers keep adapting to the parameter updates of the lower layers and the output of each layer tends to be stable, which alleviates the reduction in convergence speed caused by vanishing gradients in the higher layers; for example, for a layer input $a = (a^{(1)}, \dots, a^{(d)})$, each dimension is first normalized:
$$\hat{a}^{(k)} = \frac{a^{(k)} - E[a^{(k)}]}{\sqrt{\mathrm{Var}[a^{(k)}]}}$$
and $\hat{a}^{(k)}$ is then reconstructed:
$$y^{(k)} = \gamma^{(k)} \hat{a}^{(k)} + \beta^{(k)}$$
wherein $\gamma^{(k)} = \sqrt{\mathrm{Var}[a^{(k)}]}$ is obtained from the variance of the sample, and $\beta^{(k)} = E[a^{(k)}]$ represents the expectation of the sample;
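A small sketch of the normalize-then-reconstruct transform may help here. It assumes the batch is a list of samples with d features each, and it adds a small `eps` term for numerical stability, which is common practice but is an assumption not stated in the claim. The function name and return layout are hypothetical.

```python
import math


def batch_norm(batch, gamma=None, beta=None, eps=1e-5):
    """Normalise each feature over the batch, then apply y = gamma * a_hat + beta."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[k] for row in batch) / n for k in range(d)]
    variances = [sum((row[k] - means[k]) ** 2 for row in batch) / n
                 for k in range(d)]
    gamma = [1.0] * d if gamma is None else gamma  # scale (learned in practice)
    beta = [0.0] * d if beta is None else beta     # shift (learned in practice)
    out = [[gamma[k] * (row[k] - means[k]) / math.sqrt(variances[k] + eps) + beta[k]
            for k in range(d)]
           for row in batch]
    return out, means, variances
```

Choosing gamma as the batch standard deviation and beta as the batch mean approximately restores the original activations, which is the sense in which the transform can "reconstruct" the previous layer's output.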
in addition, each 1x1 convolutional layer performs convolution operations on the different regions of the reconstructed river water quality training data using convolution kernels with shared weights, and learns from all positions in the training set to generate low-dimensional feature vectors; the output $y_k^l$ of each convolutional layer is:
$$x_k^l = b_k^l + \sum_{i=1}^{N_{l-1}} \mathrm{conv1D}\left(w_{ik}^{l-1}, s_i^{l-1}\right)$$
$$y_k^l = f\left(x_k^l\right)$$
$$f(x) = \mathrm{SELU}(x) = \beta \begin{cases} x, & x > 0 \\ \alpha e^{x} - \alpha, & x \le 0 \end{cases}$$
in the formula, $x_k^l$ is the input of the k-th neuron in the l-th layer network, $s_i^{l-1}$ is the output of the i-th neuron in the (l-1)-th layer network, $w_{ik}^{l-1}$ is the filter between the i-th neuron of the (l-1)-th layer network and the k-th neuron of the l-th layer network, $b_k^l$ is the bias of the k-th neuron of the l-th layer network, and $N_{l-1}$ is the number of neurons in the (l-1)-th layer network; the nonlinear feature mapping of the convolutional layer is realized by the activation function f(·); wherein α and β are fixed values, β = 1.05070098 and α = 1.67326324;
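The following sketch illustrates a 1x1 convolution followed by SELU on a short sequence. It uses the standard piecewise SELU definition with the α and β constants quoted in the claim; the function names and the list-of-lists data layout are assumptions made for illustration, and the residual connection and batch normalization are left out.

```python
import math

ALPHA = 1.67326324  # fixed value quoted in the claim
BETA = 1.05070098   # fixed value quoted in the claim


def selu(x):
    """SELU: beta * x for x > 0, beta * (alpha * e^x - alpha) otherwise."""
    return BETA * x if x > 0 else BETA * (ALPHA * math.exp(x) - ALPHA)


def conv1x1(sequence, weights, biases):
    """Apply a 1x1 convolution plus SELU at every position of the sequence.

    sequence:      list of positions, each a list of input-channel values
    weights[k][i]: shared weight from input channel i to output channel k
    biases[k]:     bias of output channel k
    """
    return [[selu(b + sum(w * v for w, v in zip(row, features)))
             for row, b in zip(weights, biases)]
            for features in sequence]
```

Because the kernel has width 1, the same weights are applied independently at every position, so the layer mixes the water quality parameters with each other without mixing neighbouring time steps.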
step 2.3, the output of the 1-DRCNN is used as the input of the BiGRU model; the BiGRU network autonomously captures the short-term and long-term dependencies in the time-series data together with the contextual attributes, obtains a combined forward and backward hidden state, learns and integrates the dependencies between the water quality parameters at earlier and later times, and further optimizes the feature representation of the water quality data; the BiGRU-based feature modeling process is described as follows:
$$\overrightarrow{h}_t = \mathrm{GRU}\left(x_t, \overrightarrow{h}_{t-1}\right)$$
$$\overleftarrow{h}_t = \mathrm{GRU}\left(x_t, \overleftarrow{h}_{t+1}\right)$$
$$h_t = w_t \overrightarrow{h}_t + v_t \overleftarrow{h}_t + b_t$$
in the formula, the GRU function denotes the nonlinear transformation applied by the GRU network, which encodes the input vector $x_t$ into the corresponding GRU hidden state; $w_t$ and $v_t$ respectively denote the weights of the forward hidden layer state $\overrightarrow{h}_t$ and the backward hidden layer state $\overleftarrow{h}_t$ of the bidirectional GRU at time t; $b_t$ denotes the bias of the hidden layer state at time t;
step 2.4, at the top layer of the model, the output of the BiGRU is used as the input of a fully connected layer, and the fully connected layer generates the predicted value of the total nitrogen concentration in the river basin to be predicted.
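Steps 2.3 and 2.4 can be sketched in plain Python as below. To keep the example short it makes several simplifying assumptions that are not in the claim: each direction has a scalar hidden state (hidden size 1), the combination weights w, v and bias b are constants rather than time-dependent, and the gate parameters are passed in as hypothetical dictionaries instead of being learned.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def gru_step(x, h_prev, p):
    """One GRU update with a scalar hidden state (hidden size 1 for brevity)."""
    z = sigmoid(dot(p["wz"], x) + p["uz"] * h_prev + p["bz"])          # update gate
    r = sigmoid(dot(p["wr"], x) + p["ur"] * h_prev + p["br"])          # reset gate
    cand = math.tanh(dot(p["wh"], x) + p["uh"] * r * h_prev + p["bh"])  # candidate
    return (1.0 - z) * h_prev + z * cand


def run_gru(inputs, p):
    """Run the GRU over the sequence and return the hidden state at each step."""
    h, states = 0.0, []
    for x in inputs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bigru(inputs, p_fwd, p_bwd, w=1.0, v=1.0, b=0.0):
    """Combine forward and backward states: h_t = w * fwd_t + v * bwd_t + b."""
    forward = run_gru(inputs, p_fwd)
    backward = run_gru(inputs[::-1], p_bwd)[::-1]  # read right-to-left, re-align
    return [w * f + v * g + b for f, g in zip(forward, backward)]


def dense(h, weight, bias):
    """Fully connected output layer mapping a hidden state to one prediction."""
    return weight * h + bias
```

In use, the feature vectors produced by the feature learning module would be passed to `bigru`, and `dense` applied to the resulting state would give the total nitrogen estimate; a practical model would use a vector-valued hidden state and trained parameters.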
5. The method for predicting the total nitrogen concentration of the river based on the hybrid neural network as claimed in claim 1, wherein: in step 3, training is carried out with the Adam optimization algorithm and the optimal river water quality prediction model is obtained once training is finished, specifically comprising the following steps:
step 3.1, during model training, using the Adam optimizer to adjust the weights and biases of the model, and calculating the error between the river total nitrogen concentration output by the fully connected layer and the true concentration in the training set; when the error is smaller than a preset threshold value, finishing model training to obtain the optimal model;
step 3.2, predicting the total nitrogen concentration on the test set with the trained optimal model to obtain the prediction result of the total nitrogen concentration.
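The training loop of step 3.1 can be illustrated with a hand-written Adam update on a deliberately tiny stand-in model (a single weight and bias fitted by mean squared error). The hyperparameter values, the threshold and the function name are assumptions chosen for the example; the claim does not state them.

```python
import math


def train_with_adam(xs, ys, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8,
                    threshold=1e-4, max_steps=5000):
    """Fit y = w * x + b with Adam; stop once the MSE drops below `threshold`."""
    params = [0.0, 0.0]  # [weight, bias]
    m = [0.0, 0.0]       # first-moment (mean) estimates
    v = [0.0, 0.0]       # second-moment (uncentred variance) estimates
    n = len(xs)
    loss = float("inf")
    for step in range(1, max_steps + 1):
        errors = [params[0] * x + params[1] - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        if loss < threshold:  # error below the preset threshold: training ends
            break
        grads = [2.0 * sum(e * x for e, x in zip(errors, xs)) / n,
                 2.0 * sum(errors) / n]
        for k in range(2):
            m[k] = beta1 * m[k] + (1.0 - beta1) * grads[k]
            v[k] = beta2 * v[k] + (1.0 - beta2) * grads[k] ** 2
            m_hat = m[k] / (1.0 - beta1 ** step)  # bias-corrected moments
            v_hat = v[k] / (1.0 - beta2 ** step)
            params[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params, loss
```

The `max_steps` cap is a safeguard added for the sketch so that the loop terminates even if the threshold is never reached.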
6. The method for predicting the total nitrogen concentration of the river based on the hybrid neural network as claimed in claim 1, wherein: the performance indexes of the evaluation model in step 4 are as follows:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - Y_i \right|$$
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - Y_i}{Y_i} \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - Y_i \right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( Y_i - y_i \right)^2}{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2}$$
where N is the number of samples in the test set, $y_i$ is the i-th total nitrogen concentration predicted by the model, $Y_i$ is the observed value (true value) of the i-th total nitrogen concentration in the test set, and $\bar{Y}$ is the average of the observed total nitrogen concentrations in the test set; the closer MAE, MAPE and RMSE are to 0 and the closer $R^2$ is to 1, the higher the prediction accuracy of the constructed model.
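The four indexes can be computed directly from paired predictions and observations, as in this short sketch. The function name and the dictionary return format are illustrative choices, and MAPE is expressed in percent, which assumes no observed value is zero.

```python
import math


def evaluate(predicted, observed):
    """Return MAE, MAPE (percent), RMSE and R^2 for paired sequences."""
    n = len(observed)
    pairs = list(zip(predicted, observed))
    mae = sum(abs(p - o) for p, o in pairs) / n
    mape = 100.0 * sum(abs((p - o) / o) for p, o in pairs) / n
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in pairs) / n)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for p, o in pairs)      # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}
```

For example, predictions of 1.0, 2.0 and 4.0 mg/L against observations of 1.0, 2.0 and 3.0 mg/L give an MAE of about 0.33 mg/L and an R² of 0.5.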
CN202110669073.XA 2021-06-17 2021-06-17 River total nitrogen concentration prediction method based on hybrid neural network Pending CN113408799A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110669073.XA CN113408799A (en) 2021-06-17 2021-06-17 River total nitrogen concentration prediction method based on hybrid neural network

Publications (1)

Publication Number Publication Date
CN113408799A true CN113408799A (en) 2021-09-17

Family

ID=77684502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110669073.XA Pending CN113408799A (en) 2021-06-17 2021-06-17 River total nitrogen concentration prediction method based on hybrid neural network

Country Status (1)

Country Link
CN (1) CN113408799A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114936721A (en) * 2022-07-15 2022-08-23 北京师范大学 Method and device for determining discharge amount of nitrous oxide in river and electronic equipment
CN115392125A (en) * 2022-08-29 2022-11-25 广东工业大学 Temperature prediction method for rotary cement kiln
CN115389812A (en) * 2022-10-28 2022-11-25 国网信息通信产业集团有限公司 Artificial neural network short-circuit current zero prediction method and prediction terminal
CN118345334A (en) * 2024-06-17 2024-07-16 华兴源创(成都)科技有限公司 Film thickness correction method and device and computer equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112116080A (en) * 2020-09-24 2020-12-22 中国科学院沈阳计算技术研究所有限公司 CNN-GRU water quality prediction method integrated with attention mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yan Jianzhuo et al.: "Water quality prediction in the Luan River based on a 1-DRCNN and BiGRU hybrid neural network model", Water, pages 1 - 19 *

Similar Documents

Publication Publication Date Title
CN113408799A (en) River total nitrogen concentration prediction method based on hybrid neural network
Latif et al. Application of artificial neural network for forecasting nitrate concentration as a water quality parameter: a case study of Feitsui Reservoir, Taiwan
Kuo et al. A hybrid neural–genetic algorithm for reservoir water quality management
Huo et al. Using artificial neural network models for eutrophication prediction
CN106198909A (en) An aquaculture water quality prediction method based on deep learning
CN107506857B (en) Urban lake and reservoir cyanobacterial bloom multivariable prediction method based on fuzzy support vector machine
Liu et al. Prediction of dissolved oxygen content in aquaculture of Hyriopsis cumingii using Elman neural network
CN109828089A (en) DBN-BP-based water quality parameter nitrous acid nitrogen online prediction method
Si et al. Modeling soil water content in extreme arid area using an adaptive neuro-fuzzy inference system
Li et al. A new ANN-Markov chain methodology for water quality prediction
CN114242156A (en) Real-time prediction method and system for relative abundance of pathogenic vibrios on marine micro-plastic
CN116611580A (en) Ocean red tide prediction method based on multi-source data and deep learning
CN117164103A (en) Intelligent control method, terminal and system of domestic sewage treatment system
Cui et al. A secondary modal decomposition ensemble deep learning model for groundwater level prediction using multi-data
Anh et al. Wavelet-artificial neural network model for water level forecasting
CN110909492B (en) Sewage treatment process soft measurement method based on extreme gradient boosting algorithm
CN112862173B (en) Lake and reservoir cyanobacterial bloom prediction method based on self-organizing deep belief echo state network
Hu et al. BOD5 prediction based on PSO optimized BP neural network
CN114417227A (en) Method for predicting concentration of chlorophyll a in water body
Xu et al. Prediction of the Wastewater's pH Based on Deep Learning Incorporating Sliding Windows.
Kılıç et al. Water quality prediction of Asi River using fuzzy based approach
Dheda et al. A multivariate water quality parameter prediction model using recurrent neural network
Li et al. Enhanced Prediction of Dissolved Oxygen Concentration using a Hybrid Deep Learning Approach with Sinusoidal Geometric Mode Decomposition
Chen et al. Water quality prediction of artificial intelligence model: a case of Huaihe River Basin, China
Wang et al. Prediction method of cyanobacterial blooms spatial-temporal sequence based on deep belief network and fuzzy expert system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination