CN114944203A - Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning - Google Patents

Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning Download PDF

Info

Publication number
CN114944203A
Authority
CN
China
Prior art keywords
hyper
data set
prediction
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210508224.8A
Other languages
Chinese (zh)
Inventor
黄明智
李峻朗
李小勇
牛国强
易晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University filed Critical South China Normal University
Priority to CN202210508224.8A priority Critical patent/CN114944203A/en
Publication of CN114944203A publication Critical patent/CN114944203A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/10 - Analysis or design of chemical reactions, syntheses or processes
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 - Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/18 - Water
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 - Machine learning, data mining or chemometrics
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 - Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 - Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Analytical Chemistry (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medicinal Chemistry (AREA)
  • Biochemistry (AREA)
  • Food Science & Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to an intelligent monitoring method for anaerobic wastewater treatment based on an automatic optimization algorithm and deep learning, which comprises the following steps: S100, selecting historical detection data of an anaerobic treatment unit of a sewage treatment plant, wherein the historical detection data comprise input variables and output variables, and reorganizing the historical detection data into a data set; S200, automatically optimizing the hyper-parameters of a bidirectional gated recurrent unit (BiGRU) model with the tree-structured Parzen estimator, and inputting the optimal hyper-parameters into the BiGRU model for training to obtain an optimal model; S300, inputting the test data set into the trained BiGRU model to obtain point prediction results of the output variables; S400, inputting the point prediction results of the output variables into the trained GPR model. The apparatus includes a memory and a processor that implements the method when executing instructions stored in the memory.

Description

Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning
Technical Field
The invention relates to an intelligent monitoring method for anaerobic wastewater treatment based on an automatic optimization algorithm and deep learning, and belongs to the technical field of water quality detection.
Background
In the food industry, production processes generate large amounts of organic wastewater. Such wastewater is characterized by high chemical oxygen demand (COD) and high biodegradability, and it is generally degraded by anaerobic biochemical methods.
Anaerobic biochemical processes used in sewage treatment plants include the up-flow anaerobic sludge bed (UASB) and the internal circulation (IC) anaerobic reactor. Although these processes are relatively mature, the mechanism of the reaction system is quite complex: if the system is not monitored and maintained, phenomena such as acidification, sludge washout and insufficient gas production may occur, and the reaction system may collapse.
To monitor existing anaerobic reaction systems, operators usually measure the influent and effluent water quality indexes at intervals. For some indexes such as COD, however, the measurement process is long, so the monitoring data lag behind the actual state and system abnormalities are difficult to predict or discover in time. In addition, water quality is influenced by many factors and therefore exhibits strong nonlinearity, volatility and uncertainty, which makes system abnormalities even harder to predict.
In recent years, machine learning, and deep learning in particular, has developed rapidly, and techniques for predicting the influent and effluent water quality of sewage treatment plants with neural networks have attracted wide research attention. The bidirectional gated recurrent unit (BiGRU) used in deep learning achieves high point prediction accuracy on time-series prediction problems such as wastewater treatment water quality, but it cannot perform interval prediction or probability prediction, and manually tuning its hyper-parameters is time-consuming and labor-intensive. Specifically, the gated recurrent unit (GRU) is an improved version of the recurrent neural network (RNN); by adding a reset gate and an update gate to the hidden layer of the RNN, it alleviates the long-term dependence problem of the RNN in time-series prediction. The BiGRU adds a backward-direction layer to the GRU network, so that the output at the current time is linked to both the previous state and the following state, which improves prediction accuracy.
In addition, constructing a neural network traditionally involves manual tuning of the network hyper-parameters: after each training run, developers must adjust every hyper-parameter according to the resulting prediction accuracy, which is time-consuming and labor-intensive and may lead to a local optimum rather than a globally optimal model. Random search and grid search can optimize parameters automatically, but the randomness of the former and the exhaustive traversal of the latter make both time-consuming. The tree-structured Parzen estimator (TPE) is an automatic parameter optimization method based on Bayesian optimization; compared with random search and grid search, its heuristic, strategy-guided search can find the optimal hyper-parameters in a shorter time. How to apply TPE to the multi-hyper-parameter tuning process of the BiGRU is one of the problems studied here.
Therefore, enabling the BiGRU to perform interval prediction and probability prediction while overcoming the drawback of manual hyper-parameter tuning is a theoretical and practical engineering problem that urgently needs to be solved.
Disclosure of Invention
The invention provides an intelligent monitoring method for anaerobic wastewater treatment based on an automatic optimization algorithm and deep learning, and aims to solve at least one of the technical problems in the prior art.
The technical scheme of the invention relates to an intelligent monitoring method for anaerobic wastewater treatment based on an automatic optimization algorithm and deep learning, which comprises the following steps:
S100, selecting historical detection data of an anaerobic treatment unit of a sewage treatment plant, wherein the historical detection data comprise input variables and output variables, reorganizing the historical detection data into a data set, dividing the data set into a training data set and a test data set, and then performing data preprocessing on the training data set and the test data set;
S200, constructing a bidirectional gated recurrent unit (BiGRU) model with automatic hyper-parameter optimization, inputting the training data set into the BiGRU model for training, automatically optimizing the hyper-parameters of the BiGRU model with the tree-structured Parzen estimator to obtain the optimal hyper-parameters, and inputting the optimal hyper-parameters into the BiGRU model for training to obtain an optimal model;
S300, inputting the test data set into the trained BiGRU model to obtain point prediction results of the output variables;
S400, constructing a Gaussian process regression (GPR) model, inputting the point prediction results of the output variables into the trained GPR model to obtain the probability distribution function corresponding to each point prediction result, and determining the prediction interval and probability prediction corresponding to each point prediction result based on the probability distribution function.
Further, the input variables comprise the COD, pH value, volatile fatty acid concentration, organic loading rate and alkalinity of the influent wastewater, and the output variables comprise the COD and gas production of the effluent.
Further, wherein the step S100 includes:
S110, selecting the historical detection data of a historical time period, and reorganizing the historical detection data into a data set in the form of a data set matrix of the following form:

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,k} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,k} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,k} \end{bmatrix}$$

where $x_{t,j}$ denotes the $j$-th wastewater index at the $t$-th historical time; the row vectors of the data set matrix contain the wastewater index data of the different types at one historical time, the column vectors represent the wastewater index data of the same type at different historical times, m is the data length of each type of wastewater index data in the data set matrix, and k is the number of wastewater indexes.
Further, wherein the step S100 includes:
S120, dividing the data set into a training data set and a test data set at a ratio of 8:2 to construct the primary training data set and the primary test data set,
wherein the primary training data set is used for training the bidirectional gated recurrent unit model, and the primary test data set is used for verifying the prediction accuracy of the bidirectional gated recurrent unit model;
S130, screening out and eliminating abnormal values in the data set, and then normalizing the data set with the following formula:

$$X_{1i} = \frac{X_{1i,0} - \min X_{1i,0}}{\max X_{1i,0} - \min X_{1i,0}}$$

where i is the dimension of the input variables, $X_{1i,0}$ is the raw data value of the i-th dimension of the input variables, $\min X_{1i,0}$ and $\max X_{1i,0}$ are the minimum and maximum values of the i-th-dimension raw data, and $X_{1i}$ is the normalized data value of the i-th dimension of the input variables.
Further, wherein the step S200 includes:
S210, setting the neural network hyper-parameters of the bidirectional gated recurrent unit model,
wherein some of the neural network hyper-parameters are selected as the hyper-parameters to be optimized,
and wherein the bidirectional gated recurrent unit model comprises an input layer, a hidden layer and an output layer, the hidden layer comprising an attention mechanism layer, a bidirectional layer and a GRU layer;
S220, setting the search space of the hyper-parameters to be optimized;
S230, combining the hyper-parameters within the search space;
S240, training the bidirectional gated recurrent unit model, automatically optimizing the hyper-parameters according to the accuracy of multiple training runs, and finally obtaining the optimal hyper-parameters and the optimal model.
Further, the hyper-parameters comprise the number of input-layer neurons, the number of hidden-layer neurons, the number of output-layer neurons, the learning rate, the batch size and the number of iterations.
Further, wherein the step S240 includes:
S241, according to the set hyper-parameters to be optimized and the corresponding hyper-parameter space, in the first training runs, randomly sampling and combining the hyper-parameters in the hyper-parameter space with the tree-structured Parzen estimator, and creating an error observation set $\{(x^{(i)}, y^{(i)}),\ i = 1, 2, \ldots, N_{init}\}$,
where $x$ is a hyper-parameter combination to be optimized and $y$ is the error value obtained by training with that hyper-parameter combination;
S242, according to the results and training accuracy of the previous N training runs of the bidirectional gated recurrent unit model, setting an error quantile $y^{*}$ with the tree-structured Parzen estimator, dividing the error observation set into two parts, and modelling the probability density of the hyper-parameters with the following formula:

$$p(x \mid y) = \begin{cases} l(x), & y < y^{*} \\ g(x), & y \ge y^{*} \end{cases}$$

where $l(x)$ is the probability density function of the hyper-parameter combinations whose error values are less than $y^{*}$, and $g(x)$ is the probability density function of the hyper-parameter combinations whose error values are greater than or equal to $y^{*}$;
S243, calculating the expected improvement (EI) value with the following formula:

$$EI_{y^{*}}(x) \propto \left(\gamma + \frac{g(x)}{l(x)}\,(1 - \gamma)\right)^{-1}, \qquad \gamma = p(y < y^{*})$$

S244, selecting the next group of hyper-parameter values by maximizing the EI value;
S245, repeating steps S241 to S244 until the bidirectional gated recurrent unit model reaches the set number of training runs, and then ending.
Further, wherein the step S400 comprises the steps of:
S410, taking the input variables of the primary training set and the primary test set as the input variables of the secondary training set and the secondary test set, and constructing the output variables of the secondary training set and the secondary test set from the point prediction results of the output variables;
S420, inputting the secondary training set and the secondary test set into the GPR model to obtain the probability distribution function corresponding to each point prediction result of the output variables, wherein the probability distribution function obeys a Gaussian distribution;
S430, determining the prediction interval of each point prediction result at a preset confidence level based on the mean value and standard deviation of the probability distribution function and the preset confidence level.
Further, the calculated point prediction, interval prediction and probability prediction indexes are compared among the BiGRU-GPR model, the BiLSTM-GPR model and the Gaussian process regression GPR model to identify the better model, the point prediction indexes being calculated with the following formulas:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n}(y_{i} - \bar{y})^{2}}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}$$

$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i} - \hat{y}_{i}}{y_{i}}\right|$$

where $y_{i}$ is the $i$-th observed value, $\bar{y}$ is the mean of the observed values, $\hat{y}_{i}$ is the $i$-th predicted value, and $n$ is the number of prediction samples;
the interval prediction indexes are calculated with the following formulas:

$$CP = \frac{n_{j}}{n}$$

$$MWP = \frac{1}{n}\sum_{i=1}^{n}\frac{y_{upper,i} - y_{lower,i}}{y_{i}}$$

$$MC = MWP / CP$$

where $y_{upper,i}$ is the upper limit of the prediction interval of the $i$-th point prediction, $y_{lower,i}$ is the lower limit of the prediction interval of the $i$-th point prediction, and $n_{j}$ is the number of observed values falling within the prediction interval;
the probability prediction index is calculated with the following formulas:

$$CRPS = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{+\infty}\left[F(y) - H(y - y_{i})\right]^{2}\,dy$$

$$H(y - y_{i}) = \begin{cases} 0, & y < y_{i} \\ 1, & y \ge y_{i} \end{cases}$$

where $F(y_{i})$ is the cumulative distribution function of the predicted values and $H(\cdot)$ is the unit step function.
The invention also relates to a computer device comprising a memory and a processor, wherein the processor executes a computer program stored in the memory to implement the method.
The beneficial effects of the invention are as follows.
1. The method can predict the effluent COD and the gas production according to the changes of the indexes during the anaerobic reaction process, and at the same time gives the interval prediction and probability prediction corresponding to each point prediction result, so the output results have good credibility.
2. To address the pain point of researchers having to manually tune neural network hyper-parameters, the invention introduces the tree-structured Parzen estimator to automatically optimize the neural network hyper-parameters so that the neural network model reaches its optimal state.
Drawings
Fig. 1 is a general flow diagram of a method according to the invention.
Fig. 2 is a schematic diagram of a structure of a GRU hidden layer in an embodiment of the present invention.
Fig. 3 is a schematic diagram of an implementation of data set partitioning and prediction according to an embodiment of the present invention.
Fig. 4 is a diagram of a BiGRU neural network structure according to an embodiment of the present invention.
Fig. 5 is a graph comparing COD point prediction results according to an embodiment of the present invention.
FIG. 6 is a diagram showing the result of prediction of COD interval in the example according to the present invention.
Fig. 7 is a graph of gas production point prediction results in an embodiment in accordance with the invention.
Fig. 8 is a graph of the prediction result of the gas production interval according to the embodiment of the present invention.
FIG. 9 is a table comparing point prediction results for three models according to an embodiment of the present invention.
Fig. 10 is a comparison table of interval prediction results of three models according to the embodiment of the present invention.
FIG. 11 is a table comparing the results of probability predictions for three models in an embodiment in accordance with the invention.
Detailed Description
The conception, the specific structure and the technical effects produced by the present invention will be clearly and completely described in conjunction with the embodiments and the attached drawings, so as to fully understand the objects, the schemes and the effects of the present invention.
The terms and abbreviations used in the embodiments of the present invention are explained as follows:
Tree-structured Parzen Estimator, TPE: tree-structured Parzen estimation method
Bidirectional Gated Recurrent Units, BiGRU: bidirectional gated recurrent unit
Gaussian Process Regression, GPR: Gaussian process regression
COD: chemical oxygen demand
VFA: volatile fatty acid concentration
OLR: organic loading rate
ALK: alkalinity
CP: interval coverage rate
MWP: mean width of the prediction interval
CRPS: continuous ranked probability score
Referring to fig. 1 to 11, in some embodiments the present invention discloses an intelligent monitoring method for anaerobic wastewater treatment based on an automatic optimization algorithm and deep learning; referring to the flowchart of fig. 1, the method comprises the following steps:
S100, selecting historical detection data of an anaerobic treatment unit of a sewage treatment plant, wherein the historical detection data comprise input variables and output variables, reorganizing the historical detection data into a data set, dividing the data set into a training data set and a test data set, and then performing data preprocessing on the training data set and the test data set;
S200, constructing a bidirectional gated recurrent unit model with automatic hyper-parameter optimization, inputting the training data set into the bidirectional gated recurrent unit model for training, automatically optimizing the hyper-parameters of the bidirectional gated recurrent unit model with the tree-structured Parzen estimator to obtain the optimal hyper-parameters, and inputting the optimal hyper-parameters into the bidirectional gated recurrent unit model for training to obtain an optimal model;
S300, inputting the test data set into the trained bidirectional gated recurrent unit model to obtain point prediction results of the output variables;
S400, constructing a Gaussian process regression GPR model, inputting the point prediction results of the output variables into the trained GPR model to obtain the probability distribution function corresponding to each point prediction result, and determining the prediction interval and probability prediction corresponding to each point prediction result based on the probability distribution function.
For step S100
S100, selecting historical detection data of an anaerobic treatment unit of a sewage treatment plant, wherein the historical detection data comprise input variables and output variables, reorganizing the historical detection data into a data set used as the deep learning data set, dividing the data set into a training data set and a test data set, and then performing data preprocessing on the training data set and the test data set.
According to the historical detection data, the input variables comprise the COD, pH value, volatile fatty acid concentration, organic loading rate and alkalinity of the influent wastewater, and the output variables comprise the COD and gas production of the effluent.
S110, selecting the historical detection data of a historical time period, and reorganizing the historical detection data into a data set in the form of a data set matrix of the following form:

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,k} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,k} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,k} \end{bmatrix}$$

where $x_{t,j}$ denotes the $j$-th wastewater index at the $t$-th historical time; the row vectors of the data set matrix contain the wastewater index data of the different types at one historical time, the column vectors represent the wastewater index data of the same type at different historical times, m is the data length of each type of wastewater index data in the data set matrix, and k is the number of wastewater indexes.
S120, dividing the data set into a training data set and a test data set at a ratio of 8:2, and then constructing the primary training data set and the primary test data set, each consisting of the corresponding input-variable matrix and output-variable matrix. The primary training data set is used for training the bidirectional gated recurrent unit model, and the primary test data set is used for verifying the prediction accuracy of the bidirectional gated recurrent unit model.
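As a minimal sketch (not part of the original disclosure), the reorganization of the historical detection data into the data set matrix and the 8:2 split into the primary training and test sets could look as follows; the column names and the use of pandas/NumPy are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Assumed column names for the five input indexes and the two output indexes.
INPUT_COLS = ["COD_in", "pH", "VFA", "OLR", "ALK"]
OUTPUT_COLS = ["COD_out", "gas_production"]

def build_primary_sets(df: pd.DataFrame, train_ratio: float = 0.8):
    """Reorganize historical detection data (rows = historical times,
    columns = wastewater indexes) and split it 8:2 in time order."""
    data = df[INPUT_COLS + OUTPUT_COLS].to_numpy(dtype=float)  # data set matrix, shape (m, k)
    split = int(data.shape[0] * train_ratio)
    X_train, y_train = data[:split, :len(INPUT_COLS)], data[:split, len(INPUT_COLS):]
    X_test, y_test = data[split:, :len(INPUT_COLS)], data[split:, len(INPUT_COLS):]
    return X_train, y_train, X_test, y_test
```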
S130, screening out and eliminating abnormal values in the data set, and then normalizing the data set with the following formula:

$$X_{1i} = \frac{X_{1i,0} - \min X_{1i,0}}{\max X_{1i,0} - \min X_{1i,0}}$$

where i is the dimension of the input variables, $X_{1i,0}$ is the raw data value of the i-th dimension of the input variables, $\min X_{1i,0}$ and $\max X_{1i,0}$ are the minimum and maximum values of the i-th-dimension raw data, and $X_{1i}$ is the normalized data value of the i-th dimension of the input variables.
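A minimal sketch of the min-max normalization in S130, assuming (as is common practice, though not stated explicitly above) that the minimum and maximum are computed on the training data and reused for the test data:

```python
import numpy as np

def minmax_fit(X_train: np.ndarray):
    """Per-dimension minimum and maximum of the raw training data."""
    return X_train.min(axis=0), X_train.max(axis=0)

def minmax_transform(X: np.ndarray, x_min: np.ndarray, x_max: np.ndarray) -> np.ndarray:
    """Apply X_1i = (X_1i,0 - min) / (max - min) to every dimension i."""
    return (X - x_min) / (x_max - x_min + 1e-12)  # small epsilon guards against zero range
```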
For step S200
S200, constructing a bidirectional gated recurrent unit model with automatic hyper-parameter optimization, inputting the training data set into the model for training, automatically optimizing the hyper-parameters of the model with the tree-structured Parzen estimator to obtain the optimal hyper-parameters, and inputting the optimal hyper-parameters into the model for training to obtain the optimal model.
S210, referring to fig. 4, setting the neural network hyper-parameters of the bidirectional gated recurrent unit model, and selecting some of the neural network hyper-parameters as the hyper-parameters to be optimized.
A bidirectional gated recurrent unit (BiGRU) model is constructed, which comprises an input layer, a hidden layer and an output layer; the hidden layer comprises an attention mechanism layer (Attention layer), a bidirectional layer and a GRU layer.
S220, setting the search space of the hyper-parameters to be optimized, wherein the hyper-parameters comprise the number of input-layer neurons $n_{i}$, the number of hidden-layer neurons $n_{h}$, the number of output-layer neurons $n_{o}$, the learning rate $L$, the batch size $B$ and the number of iterations $E$; $n_{i}$ and $n_{o}$ equal the numbers of input variables and output variables, namely 5 and 1 respectively.
S230, combining the hyper-parameters within the hyper-parameter search space, wherein $n_{h}$, $L$, $B$ and $E$ are assigned as the hyper-parameters to be optimized by TPE and their optimization ranges are set separately before training (see the sketch below). The optimizer of the bidirectional gated recurrent unit (BiGRU) model is Adam, and the loss function is the mean squared error (MSE).
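The following Keras sketch shows one way such a model could be assembled, with a bidirectional GRU layer, a simple dot-product attention layer and a dense output layer, compiled with Adam and MSE as stated above. The layer arrangement, default sizes and the time-step length are illustrative assumptions, not the exact patented architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_bigru(n_steps: int, n_i: int = 5, n_h: int = 64,
                n_o: int = 1, learning_rate: float = 1e-3) -> tf.keras.Model:
    """Assumed BiGRU sketch: bidirectional GRU -> attention -> dense output."""
    inputs = layers.Input(shape=(n_steps, n_i))
    seq = layers.Bidirectional(layers.GRU(n_h, return_sequences=True))(inputs)
    att = layers.Attention()([seq, seq])           # dot-product self-attention over time steps
    pooled = layers.GlobalAveragePooling1D()(att)  # collapse the time dimension
    outputs = layers.Dense(n_o)(pooled)
    model = models.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate), loss="mse")
    return model
```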
S240, training the bidirectional gated recurrent unit model, automatically optimizing the hyper-parameters according to the accuracy of multiple training runs, and finally obtaining the optimal hyper-parameters and the optimal model. The forward propagation of the information at time t is calculated as follows, referring to fig. 2:

$$r_{t} = \sigma(W_{r} \cdot h_{t-1} + W_{r} \cdot x_{t})$$

$$z_{t} = \sigma(W_{z} \cdot h_{t-1} + W_{z} \cdot x_{t})$$

$$\tilde{h}_{t} = \tanh\left(W_{\tilde{h}} \cdot (r_{t} \odot h_{t-1}) + W_{\tilde{h}} \cdot x_{t}\right)$$

$$h_{t} = (1 - z_{t}) \odot h_{t-1} + z_{t} \odot \tilde{h}_{t}$$

$$y_{t} = \sigma(W_{o} \cdot h_{t})$$

where $z_{t}$ is the update gate, $r_{t}$ is the reset gate, $x_{t}$ is the input information, $h_{t}$ is the current state, $h_{t-1}$ is the previous state, $\tilde{h}_{t}$ is the candidate state, $y_{t}$ is the output information, $W_{r}$, $W_{z}$, $W_{\tilde{h}}$ and $W_{o}$ denote the corresponding weights, $\sigma(\cdot)$ and $\tanh(\cdot)$ are activation functions, and $\odot$ denotes the element-wise product of matrices.
Here $W_{r}$, $W_{z}$ and $W_{\tilde{h}}$ are spliced weights, split as follows:

$$W_{r} = W_{rx} + W_{rh}, \qquad W_{z} = W_{zx} + W_{zh}, \qquad W_{\tilde{h}} = W_{\tilde{h}x} + W_{\tilde{h}h}$$

The error back-propagation at time t proceeds as follows: first, the loss of the network output at time t, $E_{t}$, is calculated, and the loss over all times is $E = \sum_{t} E_{t}$; then the error of the output layer is calculated and the partial derivatives of the loss function with respect to each weight parameter are computed. After the partial derivatives of all parameters have been calculated, the parameters can be updated, and this process is iterated until the loss converges, i.e. the training is completed.
After each training run of the bidirectional gated recurrent unit model, the hyper-parameters can be adjusted according to the output results and the output accuracy; the automatic optimization ends once the set number of training runs is reached, and the optimal model is output. The tree-structured Parzen estimator (TPE) works according to the following principle, and step S240 comprises the following steps:
S241, according to the set hyper-parameters to be optimized and the corresponding hyper-parameter space, in the first training runs, randomly sampling and combining the hyper-parameters in the hyper-parameter space with the tree-structured Parzen estimator, and creating an error observation set $\{(x^{(i)}, y^{(i)}),\ i = 1, 2, \ldots, N_{init}\}$,
where $x$ is a hyper-parameter combination to be optimized and $y$ is the error value obtained by training with that hyper-parameter combination;
S242, according to the results and training accuracy of the previous N training runs of the bidirectional gated recurrent unit model, setting an error quantile $y^{*}$ with the tree-structured Parzen estimator, dividing the error observation set into two parts, and modelling the probability density of the hyper-parameters with the following formula:

$$p(x \mid y) = \begin{cases} l(x), & y < y^{*} \\ g(x), & y \ge y^{*} \end{cases}$$

where $l(x)$ is the probability density function of the hyper-parameter combinations whose error values are less than $y^{*}$, and $g(x)$ is the probability density function of the hyper-parameter combinations whose error values are greater than or equal to $y^{*}$;
S243, calculating the expected improvement (EI) value with the following formula:

$$EI_{y^{*}}(x) \propto \left(\gamma + \frac{g(x)}{l(x)}\,(1 - \gamma)\right)^{-1}, \qquad \gamma = p(y < y^{*})$$

S244, selecting the next group of hyper-parameter values by maximizing the EI value;
S245, repeating steps S241 to S244 until the bidirectional gated recurrent unit model reaches the set number of training runs, and then ending.
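A hedged sketch of how steps S241 to S245 could be realized with the TPE implementation of the hyperopt library; the search ranges, the number of evaluations, and the `build_bigru` helper and training arrays (assumed to exist from the earlier sketches) are illustrative assumptions, not the patented configuration.

```python
from hyperopt import fmin, tpe, hp, Trials, STATUS_OK

# Assumed search space for the hyper-parameters to be optimized (n_h, L, B, E).
space = {
    "n_h": hp.choice("n_h", [16, 32, 64, 128]),
    "lr": hp.loguniform("lr", -7, -3),                 # learning rate L
    "batch": hp.choice("batch", [16, 32, 64]),         # batch size B
    "epochs": hp.choice("epochs", [50, 100, 200]),     # iteration number E
}

def objective(params):
    """Train one BiGRU with the sampled hyper-parameters and return its validation error."""
    model = build_bigru(n_steps=N_STEPS, n_h=params["n_h"], learning_rate=params["lr"])
    model.fit(X_train_seq, y_train_seq, batch_size=params["batch"],
              epochs=params["epochs"], verbose=0)
    mse = model.evaluate(X_val_seq, y_val_seq, verbose=0)
    return {"loss": mse, "status": STATUS_OK}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)  # TPE proposes each new trial by maximizing EI
```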
For step S300
S300, inputting the test data set into the trained bidirectional gated recurrent unit model to obtain the point prediction results of the output variables.
For step S400
S400, constructing a Gaussian process regression GPR model, inputting the point prediction results of the output variables into the trained GPR model to obtain the probability distribution function corresponding to each point prediction result, and determining the prediction interval and probability prediction corresponding to each point prediction result based on the probability distribution function.
The step S400 includes the following steps:
S410, taking the input variables of the primary training set and the primary test set as the input variables of the secondary training set and the secondary test set, and constructing the output variables of the secondary training set and the secondary test set from the point prediction results of the output variables.
S420, inputting the secondary training set and the secondary test set into the GPR model to obtain the probability distribution function corresponding to each point prediction result of the output variables, wherein the probability distribution function obeys a Gaussian distribution. The probability distribution function of the i-th sample point in the secondary test set is $N(\mu_{i}, \sigma_{i}^{2})$, where $\mu_{i}$ and $\sigma_{i}$ are the mean and standard deviation given by the GPR model; from this the point prediction, interval prediction and probability prediction results of the i-th sample point are obtained.
S430, determining the prediction interval of each point prediction result at a preset confidence level based on the mean value and standard deviation of the probability distribution function and the preset confidence level.
In the process of calculating the probability distribution function, specifically, in conjunction with fig. 3 and for simplicity of description, the input variables of the secondary training set are denoted $x$, its output variables $y$, the input variables of the secondary test set $x^{*}$, and the corresponding outputs $f(x^{*}) = y^{*}$.
When noise is present at the observation points, it is assumed to obey a normal distribution with mean 0 and variance $\sigma_{n}^{2}$, i.e.

$$y = f(x) + \varepsilon, \qquad \varepsilon \sim N(0, \sigma_{n}^{2})$$

Gaussian process regression assumes that the sample function values follow a joint normal distribution, i.e.:

$$\begin{bmatrix} y \\ y^{*} \end{bmatrix} \sim N\left(0,\ \begin{bmatrix} K(x, x) + \sigma_{n}^{2} I_{N} & K(x, x^{*}) \\ K(x^{*}, x) & K(x^{*}, x^{*}) \end{bmatrix}\right)$$

where $K$ is the kernel function and $I_{N}$ is the identity matrix.
The posterior probability of $y^{*}$ is calculated according to the following Bayesian regression equations:

$$p(y^{*} \mid x, y, x^{*}) = N(\bar{\mu}, \bar{\Sigma})$$

$$\bar{\mu} = K(x^{*}, x)\left[K(x, x) + \sigma_{n}^{2} I_{N}\right]^{-1} y$$

$$\bar{\Sigma} = K(x^{*}, x^{*}) - K(x^{*}, x)\left[K(x, x) + \sigma_{n}^{2} I_{N}\right]^{-1} K(x, x^{*})$$

Setting $\mu_{i} = \bar{\mu}_{i}$ and $\sigma_{i}^{2} = \bar{\Sigma}_{ii}$ yields the sought probability distribution function for each sample point.
Then, the upper and lower limits of the 95% prediction interval of the i-th sample point are calculated with the following formulas:

$$y_{upper,i} = \mu_{i} + 1.96\,\sigma_{i}$$

$$y_{lower,i} = \mu_{i} - 1.96\,\sigma_{i}$$
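A minimal sketch of the secondary-stage Gaussian process regression and the 95% prediction interval using scikit-learn; the kernel choice (RBF plus a white-noise term for the observation noise) and the helper name are assumptions about one reasonable realization, not the exact model of the patent.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def gpr_interval(X2_train, y2_train, X2_test, z: float = 1.96):
    """Fit GPR on the secondary training set and return the mean, standard deviation
    and 95% prediction-interval bounds for the secondary test set."""
    kernel = RBF() + WhiteKernel()   # WhiteKernel models the observation noise variance
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gpr.fit(X2_train, y2_train)
    mu, sigma = gpr.predict(X2_test, return_std=True)
    return mu, sigma, mu + z * sigma, mu - z * sigma
```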
The terms used in calculating the point prediction indexes are explained as follows:
(1) Coefficient of determination R²: measures the degree of agreement between the predicted values and the true values; the closer it is to 1, the more consistent the predicted and true values are.
(2) Root mean square error RMSE: the square root of the mean of the squared deviations between predicted and observed values; the larger the RMSE, the larger the prediction error.
(3) Mean absolute percentage error MAPE: the average absolute percentage error between the predicted and observed values; the smaller the MAPE, the better the prediction model, while a large MAPE indicates an inferior model.
The point prediction indexes are calculated with the following formulas:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n}(y_{i} - \bar{y})^{2}}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}$$

$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i} - \hat{y}_{i}}{y_{i}}\right|$$

where $y_{i}$ is the $i$-th observed value, $\bar{y}$ is the mean of the observed values, $\hat{y}_{i}$ is the $i$-th predicted value, and $n$ is the number of prediction samples.
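For example, the point prediction indexes defined above can be computed with a short NumPy helper (the function name and array arguments are illustrative):

```python
import numpy as np

def point_metrics(y_true, y_pred):
    """Compute R^2, RMSE and MAPE (in percent) as defined above."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0
    return r2, rmse, mape
```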
The terms used in calculating the interval prediction indexes are explained as follows:
(1) Interval coverage rate CP: the percentage of observed values covered by the prediction interval; the closer CP is to 1, the more observed values are covered by the prediction interval.
(2) Mean interval width MWP: the average width of the prediction interval; the smaller the MWP, the more reliable the interval prediction.
(3) Comprehensive interval prediction index MC: an index combining MWP and CP; the smaller the MC value, the better the interval prediction.
The interval prediction indexes are calculated with the following formulas:

$$CP = \frac{n_{j}}{n}$$

$$MWP = \frac{1}{n}\sum_{i=1}^{n}\frac{y_{upper,i} - y_{lower,i}}{y_{i}}$$

$$MC = MWP / CP$$

where $y_{upper,i}$ is the upper limit of the prediction interval of the $i$-th point prediction, $y_{lower,i}$ is the lower limit of the prediction interval of the $i$-th point prediction, and $n_{j}$ is the number of observed values falling within the prediction interval.
The term used in calculating the probability prediction index is explained as follows:
(1) Continuous ranked probability score CRPS: measures the difference between the predicted distribution and the true distribution; the closer the CRPS is to 0, the more consistent the predicted distribution is with the true distribution.
The probability prediction index is calculated with the following formulas:

$$CRPS = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{+\infty}\left[F(y) - H(y - y_{i})\right]^{2}\,dy$$

$$H(y - y_{i}) = \begin{cases} 0, & y < y_{i} \\ 1, & y \ge y_{i} \end{cases}$$

where $F(y_{i})$ is the cumulative distribution function of the predicted values and $H(\cdot)$ is the unit step function.
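The interval and probability indexes could be computed as follows. Here MWP is taken as the mean relative interval width, and the CRPS uses the closed-form expression for a Gaussian predictive distribution, which matches the GPR output but is an assumption relative to the general integral definition above.

```python
import numpy as np
from scipy.stats import norm

def interval_metrics(y_true, lower, upper):
    """CP (coverage), MWP (mean relative interval width) and MC = MWP / CP."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    cp = np.mean((y_true >= lower) & (y_true <= upper))
    mwp = np.mean((upper - lower) / np.abs(y_true))
    return cp, mwp, mwp / cp

def crps_gaussian(y_true, mu, sigma):
    """Average closed-form CRPS of N(mu, sigma^2) predictions against observations."""
    z = (np.asarray(y_true, float) - mu) / sigma
    return np.mean(sigma * (z * (2.0 * norm.cdf(z) - 1.0)
                            + 2.0 * norm.pdf(z) - 1.0 / np.sqrt(np.pi)))
```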
The calculated point prediction, interval prediction and probability prediction indexes are compared among the BiGRU-GPR model, the BiLSTM-GPR model and the Gaussian process regression GPR model to identify the better model, in conjunction with figures 5 to 8.
Specifically, referring to the table of point prediction results in fig. 9: in point prediction, for both COD and gas production, all three models achieve very high prediction accuracy and the prediction curves match the observed curves closely. The BiGRU-GPR and BiLSTM-GPR deep-learning models automatically optimized with the tree-structured Parzen estimator (TPE) not only avoid the complexity of manually tuning the hyper-parameters but also reach very high prediction accuracy. The three indexes of the BiGRU-GPR model are superior to those of the BiLSTM-GPR model and the Gaussian process regression GPR model, so the point prediction results obtained with the method of the invention have the highest accuracy.
Referring to the interval prediction results in the table of fig. 10: in interval prediction, the comparison of the three models shows the same trend for the prediction of COD and gas production: for CP, the BiGRU-GPR and BiLSTM-GPR models differ only slightly; for MWP, the prediction interval of the BiGRU-GPR model is the narrowest; and the comprehensive index MC of the BiGRU-GPR model is the lowest. This shows that the interval prediction results obtained with the method of the invention have the highest accuracy and reliability.
Referring to the probability prediction results in the table of fig. 11: in probability prediction, for COD the CRPS value of the BiGRU-GPR model is the smallest, at 0.0329; for gas production, the CRPS value of the BiLSTM-GPR model is the smallest, at 0.1297. Overall, the distribution of the predicted values obtained with the method of the invention is highly consistent with the distribution of the true values.
It should be recognized that the method steps in embodiments of the present invention may be embodied or carried out by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer readable memory. The method may use standard programming techniques. Each program may be implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on a programmed application specific integrated circuit for this purpose.
Further, the operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be implemented in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated onto a computing platform, such as a hard disk, optically read and/or write storage media, RAM, ROM, etc., so that it is readable by a programmable computer, which when read by the computer can be used to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described herein includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention may also include the computer itself when programmed according to the methods and techniques described herein.
A computer program can be applied to input data to perform the functions described herein to transform the input data to generate output data that is stored to non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the invention, the transformed data represents physical and tangible objects, including particular visual depictions of physical and tangible objects produced on a display.
The above description is only a preferred embodiment of the present invention, and the present invention is not limited to the above embodiment, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present invention should be included in the protection scope of the present invention as long as the technical effects of the present invention are achieved by the same means. The invention is capable of other modifications and variations in its technical solution and/or its implementation, within the scope of protection of the invention.

Claims (10)

1. An intelligent monitoring method for anaerobic wastewater treatment based on an automatic optimization algorithm and deep learning, characterized by comprising the following steps:
S100, selecting historical detection data of an anaerobic treatment unit of a sewage treatment plant, wherein the historical detection data comprise input variables and output variables, reorganizing the historical detection data into a data set, dividing the data set into a training data set and a test data set, and then performing data preprocessing on the training data set and the test data set;
S200, constructing a bidirectional gated recurrent unit model with automatic hyper-parameter optimization, inputting the training data set into the bidirectional gated recurrent unit model for training, automatically optimizing the hyper-parameters of the bidirectional gated recurrent unit model with the tree-structured Parzen estimator to obtain the optimal hyper-parameters, and inputting the optimal hyper-parameters into the bidirectional gated recurrent unit model for training to obtain an optimal model;
S300, inputting the test data set into the trained bidirectional gated recurrent unit model to obtain point prediction results of the output variables;
S400, constructing a Gaussian process regression GPR model, inputting the point prediction results of the output variables into the trained GPR model to obtain the probability distribution function corresponding to each point prediction result, and determining the prediction interval and probability prediction corresponding to each point prediction result based on the probability distribution function.
2. The method of claim 1, wherein the input variables include the COD, pH value, volatile fatty acid concentration, organic loading rate and alkalinity of the influent wastewater, and the output variables include the COD and gas production of the effluent.
3. The method of claim 1, wherein the step S100 comprises:
S110, selecting the historical detection data of a historical time period, and reorganizing the historical detection data into a data set in the form of a data set matrix of the following form:

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,k} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,k} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,k} \end{bmatrix}$$

where $x_{t,j}$ denotes the $j$-th wastewater index at the $t$-th historical time; the row vectors of the data set matrix contain the wastewater index data of the different types at one historical time, the column vectors represent the wastewater index data of the same type at different historical times, m is the data length of each type of wastewater index data in the data set matrix, and k is the number of wastewater indexes.
4. The method of claim 3, wherein the step S100 comprises:
S120, dividing the data set into a training data set and a test data set at a ratio of 8:2 to construct the primary training data set and the primary test data set,
wherein the primary training data set is used for training the bidirectional gated recurrent unit model, and the primary test data set is used for verifying the prediction accuracy of the bidirectional gated recurrent unit model;
S130, screening out and eliminating abnormal values in the data set, and then normalizing the data set with the following formula:

$$X_{1i} = \frac{X_{1i,0} - \min X_{1i,0}}{\max X_{1i,0} - \min X_{1i,0}}$$

where i is the dimension of the input variables, $X_{1i,0}$ is the raw data value of the i-th dimension of the input variables, $\min X_{1i,0}$ and $\max X_{1i,0}$ are the minimum and maximum values of the i-th-dimension raw data, and $X_{1i}$ is the normalized data value of the i-th dimension of the input variables.
5. The method of claim 1, wherein the step S200 comprises:
S210, setting the neural network hyper-parameters of the bidirectional gated recurrent unit model,
wherein some of the neural network hyper-parameters are selected as the hyper-parameters to be optimized,
and wherein the bidirectional gated recurrent unit model comprises an input layer, a hidden layer and an output layer, the hidden layer comprising an attention mechanism layer, a bidirectional layer and a GRU layer;
S220, setting the search space of the hyper-parameters to be optimized;
S230, combining the hyper-parameters within the search space;
S240, training the bidirectional gated recurrent unit model, automatically optimizing the hyper-parameters according to the accuracy of multiple training runs, and finally obtaining the optimal hyper-parameters and the optimal model.
6. The method of claim 5, wherein the hyper-parameters include the number of input-layer neurons, the number of hidden-layer neurons, the number of output-layer neurons, the learning rate, the batch size and the number of iterations.
7. The method of claim 5, wherein the step S240 comprises:
S241, according to the set hyper-parameters to be optimized and the corresponding hyper-parameter space, in the first training runs, randomly sampling and combining the hyper-parameters in the hyper-parameter space with the tree-structured Parzen estimator, and creating an error observation set $\{(x^{(i)}, y^{(i)}),\ i = 1, 2, \ldots, N_{init}\}$,
where $x$ is a hyper-parameter combination to be optimized and $y$ is the error value obtained by training with that hyper-parameter combination;
S242, according to the results and training accuracy of the previous N training runs of the bidirectional gated recurrent unit model, setting an error quantile $y^{*}$ with the tree-structured Parzen estimator, dividing the error observation set into two parts, and modelling the probability density of the hyper-parameters with the following formula:

$$p(x \mid y) = \begin{cases} l(x), & y < y^{*} \\ g(x), & y \ge y^{*} \end{cases}$$

where $l(x)$ is the probability density function of the hyper-parameter combinations whose error values are less than $y^{*}$, and $g(x)$ is the probability density function of the hyper-parameter combinations whose error values are greater than or equal to $y^{*}$;
S243, calculating the expected improvement (EI) value with the following formula:

$$EI_{y^{*}}(x) \propto \left(\gamma + \frac{g(x)}{l(x)}\,(1 - \gamma)\right)^{-1}, \qquad \gamma = p(y < y^{*})$$

S244, selecting the next group of hyper-parameter values by maximizing the EI value;
S245, repeating steps S241 to S244 until the bidirectional gated recurrent unit model reaches the set number of training runs, and then ending.
8. The method of claim 1, wherein the step S400 comprises the steps of:
S410, taking the input variables of the primary training set and the primary test set as the input variables of the secondary training set and the secondary test set, and constructing the output variables of the secondary training set and the secondary test set from the point prediction results of the output variables;
S420, inputting the secondary training set and the secondary test set into the GPR model to obtain the probability distribution function corresponding to each point prediction result of the output variables, wherein the probability distribution function obeys a Gaussian distribution;
S430, determining the prediction interval of each point prediction result at a preset confidence level based on the mean value and standard deviation of the probability distribution function and the preset confidence level.
9. The method of claim 1, wherein
the calculated point prediction, interval prediction and probability prediction indexes are compared among the BiGRU-GPR model, the BiLSTM-GPR model and the Gaussian process regression GPR model to identify the better model,
the point prediction indexes being calculated with the following formulas:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n}(y_{i} - \bar{y})^{2}}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}$$

$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i} - \hat{y}_{i}}{y_{i}}\right|$$

where $y_{i}$ is the $i$-th observed value, $\bar{y}$ is the mean of the observed values, $\hat{y}_{i}$ is the $i$-th predicted value, and $n$ is the number of prediction samples;
the interval prediction indexes being calculated with the following formulas:

$$CP = \frac{n_{j}}{n}$$

$$MWP = \frac{1}{n}\sum_{i=1}^{n}\frac{y_{upper,i} - y_{lower,i}}{y_{i}}$$

$$MC = MWP / CP$$

where $y_{upper,i}$ is the upper limit of the prediction interval of the $i$-th point prediction, $y_{lower,i}$ is the lower limit of the prediction interval of the $i$-th point prediction, and $n_{j}$ is the number of observed values falling within the prediction interval;
and the probability prediction index being calculated with the following formulas:

$$CRPS = \frac{1}{n}\sum_{i=1}^{n}\int_{-\infty}^{+\infty}\left[F(y) - H(y - y_{i})\right]^{2}\,dy$$

$$H(y - y_{i}) = \begin{cases} 0, & y < y_{i} \\ 1, & y \ge y_{i} \end{cases}$$

where $F(y_{i})$ is the cumulative distribution function of the predicted values and $H(\cdot)$ is the unit step function.
10. A computer device comprising a memory and a processor, wherein the processor implements the method of any one of claims 1 to 9 when executing a computer program stored in the memory.
CN202210508224.8A 2022-05-11 2022-05-11 Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning Pending CN114944203A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210508224.8A CN114944203A (en) 2022-05-11 2022-05-11 Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210508224.8A CN114944203A (en) 2022-05-11 2022-05-11 Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning

Publications (1)

Publication Number Publication Date
CN114944203A true CN114944203A (en) 2022-08-26

Family

ID=82907526

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210508224.8A Pending CN114944203A (en) 2022-05-11 2022-05-11 Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning

Country Status (1)

Country Link
CN (1) CN114944203A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115855833A (en) * 2022-11-24 2023-03-28 中国中医科学院中药研究所 Hyperspectral imaging combined deep learning-based saponin content prediction method and system
CN115952685A (en) * 2023-02-02 2023-04-11 淮阴工学院 Sewage treatment process soft measurement modeling method based on integrated deep learning
CN115952685B (en) * 2023-02-02 2023-09-29 淮阴工学院 Sewage treatment process soft measurement modeling method based on integrated deep learning
CN116116181A (en) * 2023-04-18 2023-05-16 科扬环境科技有限责任公司 Model optimization-based waste gas and wastewater treatment method and device

Similar Documents

Publication Publication Date Title
CN114944203A (en) Wastewater treatment monitoring method and system based on automatic optimization algorithm and deep learning
Katz et al. Integrating deep learning models and multiparametric programming
Wang et al. Advanced fault diagnosis method for nuclear power plant based on convolutional gated recurrent network and enhanced particle swarm optimization
CN113837356B (en) Intelligent sewage treatment prediction method based on fused neural network
Han et al. Hierarchical extreme learning machine for feedforward neural network
CN112884056A (en) Optimized LSTM neural network-based sewage quality prediction method
CN111767517B (en) BiGRU multi-step prediction method, system and storage medium applied to flood prediction
KR102440372B1 (en) Providing method, apparatus and computer-readable medium of managing influent environmental information of sewage treatment facilities based on big data and artificial intelligence
CN108732931A (en) A kind of multi-modal batch process modeling method based on JIT-RVM
CN106600001B (en) Glass furnace Study of Temperature Forecasting method based on Gaussian mixtures relational learning machine
CN113065703A (en) Time series prediction method combining multiple models
CN113723007A (en) Mechanical equipment residual life prediction method based on DRSN and sparrow search optimization BilSTM
CN113837364B (en) Sewage treatment soft measurement method and system based on residual network and attention mechanism
Zouhri et al. Handling the impact of feature uncertainties on SVM: A robust approach based on Sobol sensitivity analysis
CN110838364A (en) Crohn disease prediction method and device based on deep learning hybrid model
CN116484747A (en) Sewage intelligent monitoring method based on self-adaptive optimization algorithm and deep learning
Buragohain Adaptive network based fuzzy inference system (ANFIS) as a tool for system identification with special emphasis on training data minimization
CN107545101A (en) A kind of design object and the Optimization Design that design variable is section
Rad et al. GP-RVM: Genetic programing-based symbolic regression using relevance vector machine
CN115456245A (en) Prediction method for dissolved oxygen in tidal river network area
Qiao et al. Design of modeling error PDF based fuzzy neural network for effluent ammonia nitrogen prediction
Zhu et al. Time-varying interval prediction and decision-making for short-term wind power using convolutional gated recurrent unit and multi-objective elephant clan optimization
Souza et al. Co-evolutionary genetic multilayer perceptron for feature selection and model design
CN117291069A (en) LSTM sewage water quality prediction method based on improved DE and attention mechanism
Abed Al Raoof et al. Maximizing CNN Accuracy: A Bayesian Optimization Approach with Gaussian Processes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination