CN115952685B

CN115952685B - Sewage treatment process soft measurement modeling method based on integrated deep learning

Info

Publication number: CN115952685B
Application number: CN202310053332.5A
Authority: CN
Inventors: 熊金琳; 彭甜; 李正波; 陶孜菡; 张楚; 赵环宇; 伏咏妍; 王宇涵; 黄小龙; 花磊
Original assignee: Huaiyin Institute of Technology
Current assignee: Cao Liang
Priority date: 2023-02-02
Filing date: 2023-02-02
Publication date: 2023-09-29
Anticipated expiration: 2043-02-02
Also published as: CN115952685A

Abstract

The invention discloses a soft measurement modeling method for sewage treatment processes based on integrated deep learning. First, sewage data are acquired as auxiliary variables. Next, KPCA is used to perform feature selection on the acquired variables, and the selected variables serve as the model input. A two-layer integrated soft measurement model for sewage is then established: the first layer comprises three base learners, BiLSTM, LSSVM and XGBoost, trained with 5-fold cross-validation, and the second layer uses an ELM as the meta-learner. Finally, an extreme learning machine corrects the error of the initial prediction result. To improve model performance, a reptile search algorithm (RSA) is used to optimize the model parameters; to address RSA's limited convergence accuracy and its tendency to fall into local optima, the algorithm is improved with Latin hypercube initialization, a nonlinear factor, and golden sine and flip (somersault) strategies. Compared with traditional soft measurement methods, the method combines the advantages of the individual models, so the overall model has stronger generalization capability and higher prediction precision.

Description

Sewage treatment process soft measurement modeling method based on integrated deep learning

Technical Field

The invention relates to the field of soft measurement modeling of industrial sewage treatment processes, in particular to a soft measurement modeling method of key water quality parameters of a sewage treatment process based on integrated deep learning.

Background

The wastewater treatment process is very complex. It involves interacting dynamic physical, biological and chemical reaction processes and exhibits strong nonlinearity, uncertainty, time-varying behavior and extensive hysteresis, so an accurate model is difficult to establish. To maintain a good operating environment for the sewage treatment system, to ensure the stability, running speed and reliability of the control system, and to ensure that the effluent is of high quality and meets discharge standards, several important quantities, such as water metering, water quality parameters and environmental parameters, must be checked and monitored in real time during treatment. In practice, however, on-line measuring instruments and the corresponding sensors are often lacking, or must work in extremely harsh environments where purchase and maintenance costs are high, so some quality variables of the industrial process are difficult to measure in real time. Soft measurement technology addresses this by establishing a mathematical model between easily measured variables (auxiliary variables) and variables that are difficult to measure directly (dominant variables), which enables real-time monitoring and control of the dominant variables. Because soft measurement is easy to maintain and has low time delay, it has developed rapidly.

Deep learning uses a multi-layer structure that is more complex than traditional models. It carries more information, extracts data features better and has stronger nonlinear representation capability, so it can accurately capture the complex mapping relationships hidden in industrial data. Combining deep learning with soft measurement allows the feature information in the data to be fully extracted and improves the prediction accuracy of the model. As deep learning has developed, many new deep models have been introduced, but a single soft measurement model still suffers from problems such as falling into local optima and insufficient precision. Because the data contain a large number of implicit features, integrating multiple algorithms and network models, and combining their respective advantages to improve the overall accuracy and generalization capability, has become a new research direction in soft measurement modeling of complex industrial processes.

Disclosure of Invention

Purpose of the invention: to address the difficulty of measuring the water quality index biochemical oxygen demand (BOD) on line during sewage treatment, a soft measurement modeling method for the sewage treatment process based on integrated deep learning is provided. Compared with traditional models, the method offers better stability and stronger generalization capability, and it has both theoretical research significance and great practical application value.

Technical scheme: the invention discloses a soft measurement modeling method for the sewage treatment process based on integrated deep learning, which comprises the following steps:

S1, acquiring sewage data and preprocessing the data;

S2, performing feature selection on the preprocessed data using KPCA and selecting suitable auxiliary variables, thereby constructing a soft measurement data sample set;

S3, taking the sewage data set processed in S2 as the original input of the first-layer base learners in a Stacking integrated framework, wherein the first-layer base learners comprise a bidirectional long short-term memory network (BiLSTM), extreme gradient boosting (XGBoost) and a least squares support vector machine (LSSVM), and the prediction results of the first layer are obtained by performing 5-fold cross-validation on each base learner;

S4, adopting an ELM as the second-layer meta-learner, and using the results obtained from the first-layer base learners as the training set of the second-layer meta-learner to complete its training;

S5, optimizing the model parameters of the base learners with an improved reptile search algorithm, wherein the improvements are: Latin hypercube initialization is adopted; during iteration, a nonlinear method is used to improve the Evolutionary Sense (ES) value; and golden sine and flip (somersault) strategies are used to improve the way individuals search for the optimum; the optimized model is then used for soft measurement to obtain the prediction result of the biochemical oxygen demand;

S6, performing error correction on the initial prediction result with an ELM to obtain the final prediction result.
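The two-layer Stacking procedure of S3 and S4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the base learners `fit_mean` and `fit_line` are simple stand-ins chosen so the example runs with the standard library only (they are not BiLSTM, LSSVM or XGBoost), the inputs are scalars, and all function names are hypothetical. What it shows is the 5-fold out-of-fold mechanism: every sample is predicted by a model that never saw it, and those predictions become the training features of the second-layer meta-learner.

```python
def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(fit, X, y, k=5):
    """Train on k-1 folds, predict the held-out fold, repeat for every fold."""
    oof = [None] * len(X)
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in fold:
            oof[i] = model(X[i])
    return oof


# Stand-in base learners (assumption: NOT the BiLSTM/LSSVM/XGBoost of the text).
# Each fit function returns a predict function.
def fit_mean(X, y):
    mean = sum(y) / len(y)
    return lambda x: mean


def fit_line(X, y):
    n = len(X)
    mx, my = sum(X) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(X, y))
             / sum((a - mx) ** 2 for a in X))
    return lambda x: my + slope * (x - mx)


# Toy data invented for illustration: y = 2x + 1.
X = [float(i) for i in range(10)]
y = [2.0 * x + 1.0 for x in X]

# One column of out-of-fold predictions per base learner; together with y
# these rows form the training set of the second-layer meta-learner.
meta_features = list(zip(out_of_fold_predictions(fit_mean, X, y),
                         out_of_fold_predictions(fit_line, X, y)))
```

Using out-of-fold predictions rather than in-sample predictions keeps the meta-learner from simply trusting whichever base learner overfits the training data most.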

Further, the sewage data in step S1 include suspended solids concentration SS, total nitrogen TN, ammonia nitrogen NH3-N, total phosphorus TP, chemical oxygen demand COD, and biochemical oxygen demand BOD at historical times.

Further, in step S2, features are extracted from the data using KPCA, with the following specific steps:

S3.1 Let the training sample data be $X = (x_1, x_2, \ldots, x_m)$; each $x_i$ is mapped to a high-dimensional feature space by a mapping function $\phi(x_i)$;

S3.2 Calculate the covariance matrix $C$ of the feature space:

$C = \frac{1}{m}\sum_{i=1}^{m}\phi(x_i)\,\phi(x_i)^T$ (2)

S3.3 Calculate the eigen-equation of the covariance matrix:

$\lambda_i \xi_i = C\xi_i$ (3)

where $\lambda_i$ is an eigenvalue of the covariance matrix $C$ and $\xi_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i$;

S3.4 Define the kernel matrix $K$:

$K = \phi(x_i)\cdot\phi(x_i)^T$ (4)

S3.5 Calculate the eigen-equation of the kernel matrix $K$:

where $\alpha_i$ is the eigenvector of the kernel matrix $K$ corresponding to its i-th eigenvalue;

S3.6 Substituting the covariance matrix $C$ and the kernel matrix $K$ into the eigen-equation of the kernel matrix $K$, the eigenvector $\xi_i$ of the covariance matrix $C$ can be expressed in terms of the nonlinear function $\phi(x_i)$ as follows:

where the coefficient is the i-th coefficient corresponding to $\xi_i$;

S3.7 Calculate the eigenvalues of the kernel matrix $K$ and arrange them in descending order;

S3.8 Calculate in turn the contribution rate $\eta_i$ of each eigenvalue and the cumulative contribution rate $P$, as follows:

S3.9 Select the features whose cumulative contribution rate $P$ is at least 85% as the main auxiliary variables input to the sewage soft measurement model.

Further, in step S3, the bidirectional long short-term memory network prediction model is established as follows:

S4.1 Take the determined auxiliary variables as the input vector x of the network;

S4.2 Define the forward hidden-layer state and the reverse hidden-layer state, with w denoting the different weight matrices; the output y of the BiLSTM is calculated as:

S4.3 Take $y_t$ as the prediction result of the model.

Further, in step S3, the extreme gradient boosting prediction model is established as follows:

S5.1 Let the training data set be $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$. Given the loss function and the regularization term $\Omega(f_k)$, the overall objective function can be written as:

where $L(\phi)$ is the expression in linear space, i denotes the i-th sample, k denotes the k-th tree, and the remaining term is the predicted value of the i-th sample $x_i$;

S5.2 Fit the residual of the previous tree's prediction with the prediction of each new tree:

S5.3 From the previous step, the prediction after the t-th tree equals the prediction of the preceding t-1 trees plus the output of the t-th tree; for the t-th tree, the objective function is:

S5.4 Approximate the original objective by a Taylor expansion of formula (14), and define

Then formula (14) can be rewritten as:

S5.5 Solve for the optimal solution of the objective function, which is:

where $I_j = \{i \mid q(x_i) = j\}$ denotes the set of samples mapped to leaf node j;

S5.6 Calculate the gain value Gain, update the maximum gain gain_max and the split point, and finally obtain the optimal split point;

S5.7 Repeat the above process to build the tree recursively until the termination condition is met.

Further, in step S3, the least squares support vector machine prediction model is established as follows:

S6.1 Optimization objective: define the loss function as:

subject to the constraint:

where $\omega$ is the weight, $\zeta_i$ is the error variable, b is the bias, and c > 0 is the penalty coefficient;

S6.2 Introducing Lagrange multipliers, formula (18) can be converted into:

where the Lagrange multipliers $a_i > 0$ $(i = 1, 2, \ldots, N)$;

S6.3 Solve for the optimality conditions

S6.4 Combining the above formulas, the optimal regression function is obtained as:

where $K(x_i, y_j)$ is the kernel function, $x_i$ is the center of the kernel function, x is the input of the training sample, and $y_i$ is the output of the training sample.

Further, in step S4, the prediction model of the meta-learner is calculated as follows:

where L is the number of hidden-layer units, N is the number of training samples, $\beta$ is the weight vector between the i-th hidden-layer unit and the output layer, w is the weight vector between the input layer and the hidden layer, g is the activation function, b is the bias vector, and x is the input vector.
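A minimal sketch of such a meta-learner follows, assuming the usual single-hidden-layer ELM form in which the output for an input x is the sum over the L hidden units of $\beta_i\, g(w_i x + b_i)$, with w and b drawn at random and only $\beta$ fitted by least squares. The scalar input, the sigmoid activation, the small ridge term and every function name are assumptions made for illustration; the patent text does not specify them.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_elm(X, y, hidden=8, ridge=1e-6, seed=0):
    """Fit the output weights beta of a single-hidden-layer ELM on scalar inputs."""
    rng = random.Random(seed)
    w = [rng.uniform(-2, 2) for _ in range(hidden)]  # random input weights, never trained
    b = [rng.uniform(-2, 2) for _ in range(hidden)]  # random hidden biases, never trained

    def hidden_row(x):
        return [sigmoid(w[i] * x + b[i]) for i in range(hidden)]

    H = [hidden_row(x) for x in X]
    # Ridge-regularised normal equations: (H^T H + ridge * I) beta = H^T y
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(hidden)] for i in range(hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(hidden)]
    beta = solve(A, rhs)

    def predict(x):
        return sum(bi * hi for bi, hi in zip(beta, hidden_row(x)))

    return predict


# Toy data invented for illustration: y = x^2 on [0, 1].
X = [i / 19 for i in range(20)]
y = [x * x for x in X]
predict = train_elm(X, y)
mse = sum((predict(x) - t) ** 2 for x, t in zip(X, y)) / len(X)
```

Because the hidden layer is fixed after random initialization, training reduces to one linear solve, which is why an ELM is cheap enough to serve both as meta-learner and as error corrector.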

Further, the steps of the improved reptile search algorithm (IRSA) in step S5 are as follows:

S5.1 Replace the random initialization of the RSA algorithm with Latin hypercube sampling initialization, and set the upper and lower search bounds, the population size and the number of iterations of IRSA;

S5.2 Encircling phase: the crocodile individuals begin to encircle the prey, and the mathematical model is as follows:

$\eta_{ij} = B_j(t) \times P_{ij}$ (24)

where $B_j(t)$ is the position of the optimal solution; $S_{i,j}(t+1)$ is the next updated position; $t$ is the current iteration number; $T_{max}$ is the maximum number of iterations; $\eta_{ij}$ is the hunting operator; $R_{ij}$ is the reduce function used to shrink the search space; $\alpha$ and $\beta$ are sensitive parameters that control the search accuracy; $r_1$ and $r_2$ are random numbers in $[1, N]$; $r_3$ is a random integer in $[-1, 1]$; $ES(t)$ is the Evolutionary Sense; $P_{ij}$ is the percentage difference between the optimal solution position and the current solution position; and $M(s_i)$ is the average position of the i-th solution;

S5.3 Improve the Evolutionary Sense parameter of formula (26); the improved Evolutionary Sense expression is:

S5.4 Hunting phase: the crocodile individuals begin to hunt, and the mathematical model is as follows:

S5.5 Introduce the golden sine and flip (somersault) strategies into the position update formula (30); the improved formula is:

where $\gamma_1$ and $\gamma_2$ are random numbers in $[0, 2\pi]$ and $[0, \pi]$, respectively; $\gamma_3$ and $\gamma_4$ are random numbers in $[0, 1]$; $F = 2$ is the flip (somersault) factor, which defines the position relative to the prey; and $x_1$ and $x_2$ are the golden sine coefficients, calculated as follows:

$x_1 = a(1-\gamma) + b\sigma$ (32)

$x_2 = a\gamma + b(1-\sigma)$ (33)

where a and b are the initial values of the golden-section search, and the accompanying coefficient is the golden ratio.

Further, in step S5, the parameters optimized with the improved reptile search algorithm include: the learning rate and the number of hidden-layer nodes of the bidirectional long short-term memory network BiLSTM; the weight and learning rate of extreme gradient boosting XGBoost; and the penalty coefficient and kernel function width of the least squares support vector machine LSSVM.

Further, in step S6, error correction using the ELM proceeds as follows:

S6.1 Subtract the initial predicted values obtained by the integrated model from the original observed values to construct an error sequence;

S6.2 Predict the error sequence with an ELM network;

S6.3 Add the initial prediction sequence and the predicted error sequence linearly to obtain the final prediction result.
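The arithmetic of steps S6.1 to S6.3 can be sketched as follows. The error model here, `fit_mean_error`, is a deliberately trivial stand-in and not an ELM, and feeding the initial prediction to the error model as its input is an assumption; the text does not state what the error-predicting ELM takes as input. All names and data values are hypothetical.

```python
def correct_errors(observed, initial_pred, fit_error_model):
    """Build the error sequence, model it, and add the predicted error back."""
    # S6.1: error sequence = observed value - initial prediction
    errors = [o - p for o, p in zip(observed, initial_pred)]
    # S6.2: fit a model of the error sequence (an ELM in the text)
    predict_error = fit_error_model(initial_pred, errors)
    # S6.3: final prediction = initial prediction + predicted error
    return [p + predict_error(p) for p in initial_pred]


def fit_mean_error(inputs, errors):
    """Stand-in error model (NOT an ELM): always predicts the mean training error."""
    mean_error = sum(errors) / len(errors)
    return lambda _x: mean_error


# Hypothetical BOD values in mg/L, invented for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
initial = [9.0, 11.5, 13.0, 15.5]
final = correct_errors(observed, initial, fit_mean_error)
```

In this toy case the initial predictions are biased low by 0.75 on average, and adding the predicted error back reduces the absolute error of every sample from 1.0 or 0.5 to 0.25.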

Beneficial effects:

The method of the invention, based on deep learning and integrated learning, takes into account the differences in the training principles of different algorithms and makes full use of the advantages of each model in the prediction process. The stronger the learning ability of the base learners and the lower the correlation between them, the better the final prediction. To address the difficulty of determining the model hyper-parameters, a reptile search algorithm is introduced to optimize the model and select the optimal model parameters. The algorithm is further improved to strengthen its optimization capability, as follows. First, because random initialization cannot distribute the population uniformly over the whole search space, Latin hypercube initialization is introduced so that the initial population covers the whole space uniformly. Second, during iteration, the strategy of randomly decreasing the Evolutionary Sense (ES) value between -2 and 2 cannot fully reflect the actual convergence process, so a nonlinear strategy is adopted instead, allowing the algorithm to balance global and local search more effectively and improving its convergence accuracy. Finally, the golden sine and flip (somersault) strategies are used to improve the way individuals search for the optimum, so that in each iteration an ordinary individual exchanges information with the optimal individual and fully absorbs the position difference between them, improving the search performance and search accuracy of the algorithm. Compared with traditional single-model prediction, the sewage treatment process soft measurement modeling method based on deep learning and integrated learning has higher precision and generalization capability.

Drawings

FIG. 1 is a schematic diagram of a multi-model training framework of a sewage treatment process soft measurement modeling method based on integrated deep learning;

FIG. 2 is a flow chart of algorithm optimization model parameters in the integrated deep learning-based sewage treatment process soft measurement modeling method provided by the invention;

FIG. 3 is a flowchart of the sewage treatment process soft measurement modeling method based on integrated deep learning.

Detailed Description

Embodiments of the present invention will be further described with reference to the accompanying drawings.

As shown in FIG. 1, the invention provides a soft measurement modeling method for the sewage treatment process based on integrated deep learning, which comprises the following steps:

S1, acquire sewage data from the international standard BSM1 simulation platform and preprocess the data.

S1.1 Obtain sewage data from the international standard simulation platform, including ammonia nitrogen NH3-N, suspended solids concentration SS, chemical oxygen demand COD, total nitrogen TN, total phosphorus TP, and biochemical oxygen demand BOD at historical times.

S1.2 Preprocess the acquired sewage data by normalizing it according to the following formula:

$S^* = \frac{S - S_{min}}{S_{max} - S_{min}}$ (1)

where $S^*$ is the normalized data, $S$ is the original data, and $S_{max}$ and $S_{min}$ are the maximum and minimum values in the original data, respectively.
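As a small worked example of this normalization, assuming the usual min-max form $S^* = (S - S_{min}) / (S_{max} - S_{min})$ implied by the symbols above, the sketch below scales one variable to the range [0, 1]. The function name, the handling of a constant column and the sample readings are assumptions for illustration.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1] via S* = (S - S_min) / (S_max - S_min)."""
    s_min, s_max = min(values), max(values)
    span = s_max - s_min
    if span == 0:
        # Assumption: a constant column maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(s - s_min) / span for s in values]


# Hypothetical COD readings in mg/L, invented for illustration only.
cod = [48.0, 52.0, 60.0, 44.0]
cod_scaled = min_max_normalize(cod)  # [0.25, 0.5, 1.0, 0.0]
```

Each auxiliary variable would be scaled separately, so that variables with large numeric ranges such as COD do not dominate those with small ranges such as TP.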

S2, KPCA is used to perform feature selection on the preprocessed data and select the most suitable auxiliary variables, thereby constructing a soft measurement data sample set.

S2.1 Let the training sample data be $X = (x_1, x_2, \ldots, x_m)$; each $x_i$ is mapped to a high-dimensional feature space by a mapping function $\phi(x_i)$.

S2.2 Calculate the covariance matrix $C$ of the feature space:

$C = \frac{1}{m}\sum_{i=1}^{m}\phi(x_i)\,\phi(x_i)^T$ (2)

S2.3 Calculate the eigen-equation of the covariance matrix:

$\lambda_i \xi_i = C\xi_i$ (3)

where $\lambda_i$ is an eigenvalue of the covariance matrix $C$ and $\xi_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i$.

S2.4 Define the kernel matrix $K$:

$K = \phi(x_i)\cdot\phi(x_i)^T$ (4)

S2.5 Calculate the eigen-equation of the kernel matrix $K$:

where $\alpha_i$ is the eigenvector of the kernel matrix $K$ corresponding to its i-th eigenvalue.

S2.6 Substituting the covariance matrix $C$ and the kernel matrix $K$ into the eigen-equation of the kernel matrix $K$, the eigenvector $\xi_i$ of the covariance matrix $C$ can be expressed in terms of the nonlinear function $\phi(x_i)$ as follows:

where the coefficient is the i-th coefficient corresponding to $\xi_i$.

S2.7 Calculate the eigenvalues of the kernel matrix $K$ and arrange them in descending order.

S2.8 Calculate in turn the contribution rate $\eta_i$ of each eigenvalue and the cumulative contribution rate $P$, as follows:

S2.9 Select the features whose cumulative contribution rate $P$ is at least 85% as the main auxiliary variables input to the sewage soft measurement model.
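The selection rule of steps S2.7 to S2.9 can be illustrated with a short sketch. It assumes the conventional definitions, in which the contribution rate of an eigenvalue is that eigenvalue divided by the sum of all eigenvalues and the cumulative contribution rate is the running sum over the eigenvalues taken in descending order. The function name and the eigenvalues are hypothetical; computing the kernel matrix and its eigenvalues is not shown.

```python
def select_components(eigenvalues, threshold=0.85):
    """Return (indices, cumulative rate) of the leading components whose
    cumulative contribution rate first reaches the threshold."""
    # S2.7: arrange the eigenvalues in descending order
    order = sorted(range(len(eigenvalues)),
                   key=lambda i: eigenvalues[i], reverse=True)
    total = sum(eigenvalues)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += eigenvalues[i] / total  # S2.8: contribution rate of component i
        if cumulative >= threshold:           # S2.9: stop once P >= 85%
            break
    return kept, cumulative


# Hypothetical kernel-matrix eigenvalues, invented for illustration only.
eigs = [1.0, 6.0, 0.4, 2.0, 0.6]
kept, p = select_components(eigs)
```

Here the three largest eigenvalues (6.0, 2.0 and 1.0) contribute 60%, 20% and 10%, so the threshold is first crossed at a cumulative rate of 90% and the components with indices 1, 3 and 0 are kept.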

S3, the sewage data set processed in S2 is taken as the original input of the first-layer base learners in a Stacking integrated framework. The first-layer base learners comprise a bidirectional long short-term memory network (BiLSTM), extreme gradient boosting (XGBoost) and a least squares support vector machine (LSSVM), and the prediction results of the first layer are obtained by performing 5-fold cross-validation on each base learner.

S3.1 Establish the bidirectional long short-term memory network prediction model.

S3.1.1 Take the determined auxiliary variables as the input vector x of the network.

S3.1.2 Define the forward hidden-layer state and the reverse hidden-layer state, with w denoting the different weight matrices; the output y of the BiLSTM is calculated as:

S3.1.3 Take $y_t$ as the prediction result of the model.

S3.2 Establish the extreme gradient boosting prediction model.

S3.2.1 Let the training data set be $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$. Given the loss function and the regularization term $\Omega(f_k)$, the overall objective function can be written as:

where $L(\phi)$ is the expression in linear space, i denotes the i-th sample, k denotes the k-th tree, and the remaining term is the predicted value of the i-th sample $x_i$.

S3.2.2 Fit the residual of the previous tree's prediction with the prediction of each new tree, so that the overall tree model improves step by step;

S3.2.3 From the previous step, the prediction after the t-th tree equals the prediction of the preceding t-1 trees plus the output of the t-th tree. For the t-th tree, the objective function is:

S3.2.4 Approximate the original objective by a Taylor expansion of formula (14), and define

Then formula (14) can be rewritten as:

S3.2.5 Solve for the optimal solution of the objective function, which is:

where $I_j = \{i \mid q(x_i) = j\}$ denotes the set of samples mapped to leaf node j.

S3.2.6 Calculate the gain value Gain, update the maximum gain gain_max and the split point, and obtain the optimal split point.

S3.2.7 Repeat the above process to build the tree recursively until the termination condition is met.
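The leaf-weight and gain computations in S3.2.5 and S3.2.6 can be sketched numerically. The sketch assumes the standard XGBoost second-order forms: with G and H the sums of first and second derivatives of the loss over the samples in a node, the optimal leaf weight is $-G/(H+\lambda)$ and the split gain is $\tfrac{1}{2}\left[\tfrac{G_L^2}{H_L+\lambda}+\tfrac{G_R^2}{H_R+\lambda}-\tfrac{G^2}{H+\lambda}\right]-\gamma$. The regularization parameters `lam` and `gamma`, the squared-error loss and the function names are assumptions for illustration.

```python
def leaf_weight(g, h, lam):
    """Optimal leaf weight w* = -G / (H + lambda) for the samples in one leaf."""
    return -sum(g) / (sum(h) + lam)


def split_gain(g, h, left_idx, lam=1.0, gamma=0.0):
    """Gain of splitting one node into left and right children."""
    def score(idx):
        G = sum(g[i] for i in idx)
        H = sum(h[i] for i in idx)
        return G * G / (H + lam)

    all_idx = list(range(len(g)))
    left = set(left_idx)
    right_idx = [i for i in all_idx if i not in left]
    return 0.5 * (score(left_idx) + score(right_idx) - score(all_idx)) - gamma


# Squared-error loss l = 0.5 * (y - y_hat)^2 gives g = y_hat - y and h = 1.
y = [1.0, 1.0, 5.0, 5.0]
y_hat = [3.0, 3.0, 3.0, 3.0]            # prediction of the previous trees
g = [p - t for p, t in zip(y_hat, y)]   # [2.0, 2.0, -2.0, -2.0]
h = [1.0] * len(y)

gain = split_gain(g, h, left_idx=[0, 1], lam=1.0)   # 16/3
w_left = leaf_weight(g[:2], h[:2], lam=1.0)         # -4/3
```

Scanning all candidate split points and keeping the one with the largest gain corresponds to updating gain_max in S3.2.6; the negative left-leaf weight pulls the over-predicted samples down toward their targets.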

S3.3 Establish the least squares support vector machine prediction model.

S3.3.1 Optimization objective: define the loss function as:

subject to the constraint:

where $\omega$ is the weight, $\zeta_i$ is the error variable, b is the bias, and c > 0 is the penalty coefficient.

S3.3.2 Introducing Lagrange multipliers, formula (18) can be converted into:

where the Lagrange multipliers $a_i > 0$ $(i = 1, 2, \ldots, N)$.

S3.3.3 Solve for the optimality conditions

S3.3.4 Combining the above formulas, the optimal regression function is obtained as:

where $K(x_i, y_j)$ is the kernel function, $x_i$ is the center of the kernel function, x is the input of the training sample, and $y_i$ is the output of the training sample.
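A minimal LSSVM regression sketch follows, assuming the textbook formulation: the optimality conditions reduce to one linear system, $\begin{bmatrix}0 & \mathbf{1}^T\\ \mathbf{1} & K + I/c\end{bmatrix}\begin{bmatrix}b\\ a\end{bmatrix}=\begin{bmatrix}0\\ y\end{bmatrix}$, and the regression function is the bias b plus the sum of $a_i K(x, x_i)$. The RBF kernel, the scalar inputs, the values of c and sigma, and all names are assumptions for illustration; in the method described here, c and the kernel width are the two quantities tuned by the search algorithm.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def rbf(u, v, sigma):
    """Gaussian (RBF) kernel with width sigma."""
    return math.exp(-((u - v) ** 2) / (2.0 * sigma ** 2))


def train_lssvm(X, y, c=100.0, sigma=0.5):
    """Solve the LSSVM linear system; return (predict, multipliers a, bias)."""
    n = len(X)
    A = [[0.0] + [1.0] * n]                       # first row: sum of a_i = 0
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / c if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    bias, a = sol[0], sol[1:]

    def predict(x):
        return bias + sum(ai * rbf(x, xi, sigma) for ai, xi in zip(a, X))

    return predict, a, bias


# Toy data invented for illustration: y = x^2 on [0, 2].
X = [0.25 * i for i in range(9)]
y = [x * x for x in X]
predict, a, bias = train_lssvm(X, y)
```

Unlike a standard SVM, which needs a quadratic program, the equality constraints of the LSSVM turn training into a single linear solve, and the training residual of each sample equals its multiplier divided by c.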

S4, the results obtained from the first-layer base learners are used as the training set of the second-layer meta-learner to complete its training; the prediction model of the meta-learner is an ELM, calculated as follows:

S5, the model parameters of the base learners are optimized with the improved reptile search algorithm, and the optimized model is used for soft measurement to obtain the prediction result of the biochemical oxygen demand.

S5.1 Replace the random initialization of the RSA algorithm with Latin hypercube sampling initialization, and set the upper and lower search bounds, the population size and the number of iterations of IRSA.

The Latin hypercube sampling initialization proceeds as follows:

S5.1.1 Determine the population size N and the dimension D.

S5.1.2 Let the interval of each variable be [low, up], where up and low are the upper and lower bounds of the variable, respectively.

S5.1.3 Divide the interval of each variable into N equal subintervals.

S5.1.4 Randomly select one point from each subinterval in each dimension.

S5.1.5 Combine the selected points to form the initial population.
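The Latin hypercube initialization steps above can be sketched with the standard library alone. The sketch assumes the same bounds [low, up] in every dimension and pairs the per-dimension points by random shuffling, which is one common way to combine them; the function name and parameter values are hypothetical.

```python
import random


def latin_hypercube(pop_size, dim, low, up, seed=None):
    """Return pop_size points in [low, up]^dim, one point per stratum in every dimension."""
    rng = random.Random(seed)
    width = (up - low) / pop_size  # length of each of the pop_size equal subintervals
    columns = []
    for _ in range(dim):
        # One random point inside each subinterval of this dimension...
        points = [low + (k + rng.random()) * width for k in range(pop_size)]
        # ...then shuffle so the strata are paired randomly across dimensions.
        rng.shuffle(points)
        columns.append(points)
    # Combine the per-dimension points into individuals.
    return [[columns[d][i] for d in range(dim)] for i in range(pop_size)]


population = latin_hypercube(pop_size=5, dim=3, low=-10.0, up=10.0, seed=42)
```

With purely random initialization several individuals can fall into the same region of a dimension while other regions stay empty; here every one of the five subintervals of each dimension contains exactly one individual.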

$\eta_{ij} = B_j(t) \times P_{ij}$ (24)

where $B_j(t)$ is the position of the optimal solution; $S_{i,j}(t+1)$ is the next updated position; $t$ is the current iteration number; $T_{max}$ is the maximum number of iterations; $\eta_{ij}$ is the hunting operator; $R_{ij}$ is the reduce function used to shrink the search space; $\alpha$ and $\beta$ are sensitive parameters that control the search accuracy; $r_1$ and $r_2$ are random numbers in $[1, N]$; $r_3$ is a random integer in $[-1, 1]$; $ES(t)$ is the Evolutionary Sense; $P_{ij}$ is the percentage difference between the optimal solution position and the current solution position; and $M(s_i)$ is the average position of the i-th solution.

In the formula, $\gamma_1$ and $\gamma_2$ are random numbers in $[0, 2\pi]$ and $[0, \pi]$, respectively; $\gamma_3$ and $\gamma_4$ are random numbers in $[0, 1]$; $F = 2$ is the flip (somersault) factor, which defines the position relative to the prey; and $x_1$ and $x_2$ are the golden sine coefficients, calculated as follows:

$x_1 = a(1-\gamma) + b\sigma$ (32)

$x_2 = a\gamma + b(1-\sigma)$ (33)

S5.6 Use the improved reptile search algorithm to optimize the learning rate and number of hidden-layer nodes of the BiLSTM, the weight and learning rate of XGBoost, and the penalty coefficient and kernel function width of the LSSVM.

S6.1, subtracting the initial predicted value obtained by the integrated model from the original observed value to construct an error sequence.

S6.2 predicts the error sequence using the ELM network.

S7, a sewage treatment soft measurement platform is built with Qt, Python and MATLAB. The platform comprises a user login interface, a sewage data monitoring module and an online prediction module, and realizes the visualization of the soft measurement system.

The invention also implements a BOD soft measurement intelligent system, which comprises a data acquisition module, a data processing module, a model training module, a parameter optimization module, an error correction module and an online monitoring module.

Data acquisition module: acquires sewage data, including ammonia nitrogen NH3-N, suspended solids concentration SS, chemical oxygen demand COD, total nitrogen TN, total phosphorus TP, and biochemical oxygen demand BOD at historical times.

Data processing module: extracts features from the collected sewage data with the KPCA method, selects highly correlated features, and screens out the auxiliary variables best suited as model input.

Model training module: establishes an integrated model based on the Stacking method, which combines BiLSTM, XGBoost and LSSVM through Stacking, improving model performance while alleviating overfitting.

Parameter optimization module: optimizes the model parameters with the IRSA algorithm, including the learning rate and number of hidden-layer nodes of the BiLSTM, the weight and learning rate of XGBoost, and the penalty coefficient and kernel function width of the LSSVM.

Error correction module: uses an ELM to correct the initial prediction results based on the error sequence.

Online monitoring module: comprises a user login interface, a sewage data monitoring module and an online prediction module, and realizes the visualization of the soft measurement system.

The present invention is not limited to the above embodiments, and any simple modification, equivalent variation and modification made to the above embodiments according to the technical substance of the present invention falls within the scope of the technical solution of the present invention.

Claims

1. A soft measurement modeling method for a sewage treatment process based on integrated deep learning, characterized by comprising the following steps:

S1, acquiring sewage data and preprocessing the data;

S2, performing feature selection on the preprocessed data using KPCA and selecting suitable auxiliary variables, thereby constructing a soft measurement data sample set; features are extracted from the data using KPCA, with the following specific steps:

S3.2 calculating the covariance matrix $C$ of the feature space:

S3.3 calculating the eigen-equation of the covariance matrix:

$\lambda_i \xi_i = C\xi_i$ (3)

S3.4 defining the kernel matrix $K$:

$K = \phi(x_i)\cdot\phi(x_i)^T$ (4)

S3.5 calculating the eigen-equation of the kernel matrix $K$:

S3.6 substituting the covariance matrix $C$ and the kernel matrix $K$ into the eigen-equation of the kernel matrix $K$, whereby the eigenvector $\xi_i$ of the covariance matrix $C$ can be expressed in terms of the nonlinear function $\phi(x_i)$ as follows:

where the coefficient is the i-th coefficient corresponding to $\xi_i$;

S3.7 calculating the eigenvalues of the kernel matrix $K$ and arranging them in descending order;

S3.9 selecting the features whose cumulative contribution rate $P$ is at least 85% as the main auxiliary variables input to the sewage soft measurement model;

2. The integrated deep learning-based sewage treatment process soft measurement modeling method according to claim 1, wherein the sewage data in the step S1 includes suspended matter concentration SS, total nitrogen TN, ammonia nitrogen NH3-N, total phosphorus TP, chemical oxygen demand COD, and biochemical oxygen demand BOD at historical time.

3. The integrated deep learning-based sewage treatment process soft measurement modeling method according to claim 1, wherein in step S3 the bidirectional long short-term memory network prediction model is established as follows:

S4.2 defining the forward hidden-layer state and the reverse hidden-layer state, with w denoting the different weight matrices, the output y of the BiLSTM being calculated as:

S4.3 taking $y_t$ as the prediction result of the model.

4. The integrated deep learning-based sewage treatment process soft measurement modeling method according to claim 1, wherein in step S3 the extreme gradient boosting prediction model is established as follows:

……

Then formula (14) can be rewritten as:

5. The integrated deep learning-based sewage treatment process soft measurement modeling method according to claim 1, wherein in step S3 the least squares support vector machine prediction model is established as follows:

S6.1 optimization objective: defining the loss function as:

subject to the constraint:

S6.2 introducing Lagrange multipliers, whereby formula (18) can be converted into:

where the Lagrange multipliers $a_i > 0$ $(i = 1, 2, \ldots, N)$;

S6.3 solving for the optimality conditions

6. The integrated deep learning-based sewage treatment process soft measurement modeling method according to claim 1, wherein the meta-learner prediction model calculation formula in step S4 is as follows:

7. The integrated deep learning based sewage treatment process soft measurement modeling method according to claim 1, wherein the step of the improved reptile search algorithm in step S5 is as follows:

$\eta_{ij} = B_j(t) \times P_{ij}$ (24)

$x_1 = a(1-\gamma) + b\sigma$ (32)

$x_2 = a\gamma + b(1-\sigma)$ (33)

8. The integrated deep learning-based sewage treatment process soft measurement modeling method according to claim 7, wherein in step S5 the parameters optimized with the improved reptile search algorithm include: the learning rate and the number of hidden-layer nodes of the bidirectional long short-term memory network BiLSTM; the weight and learning rate of extreme gradient boosting XGBoost; and the penalty coefficient and kernel function width of the least squares support vector machine LSSVM.

9. The method for modeling soft measurement of a sewage treatment process based on integrated deep learning according to any one of claims 1 to 8, wherein in the step S6, the error correction step using ELM is as follows:

s6.2, predicting an error sequence by using an ELM network;