CN105550492A

CN105550492A - System and method for predicting parameter of wastewater treatment process

Info

Publication number: CN105550492A
Application number: CN201510783321.8A
Authority: CN
Inventors: 倪网东; 刘建林; 麦燕萍; 张伟建; 黄文星; 伍文桢; 陈询吉; 陈韬
Original assignee: Sembcorp Industries Ltd
Current assignee: Shengke Water Treatment Technology Co ltd
Priority date: 2014-10-23
Filing date: 2015-10-23
Publication date: 2016-05-04
Anticipated expiration: 2035-10-23
Also published as: CN105550492B; SG10201406850VA

Abstract

The invention describes a method for predicting a parameter of a wastewater treatment process. The method comprises the following steps: obtaining a data set comprising a plurality of process variables related to the parameter of the wastewater treatment process; obtaining a predetermined number of measurement values of the parameter to be predicted; preprocessing the obtained data set, wherein the preprocessing step comprises classifying the data set into input groups which comprise the data set and the measurement values of the parameter to be predicted; obtaining a composite output at each of a plurality of soft sensors; averaging the composite outputs across the soft sensors; and setting the averaged composite output as the final predicted parameter.

Description

System and method for predicting parameters of a wastewater treatment process

Technical Field

The invention relates to a system and method for predicting parameters of a wastewater treatment process. In particular, the system and method are suitable for, but not limited to, the prediction of effluent parameters in an Expanded Granular Sludge Bed (EGSB) process, and will be described below in that context.

Background

Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

The anaerobic wastewater treatment process is an industrial process that utilizes microorganisms to degrade pollutants in wastewater in an anaerobic environment. Typical variations of anaerobic wastewater treatment processes used in industry include anaerobic filters, down-flow stationary fixed-film reactors, fluidized bed anaerobic reactors, and upflow anaerobic sludge blanket (UASB) reactors. Expanded Granular Sludge Bed (EGSB) reactors are a particular application of UASB with high wastewater treatment rates. In the EGSB process, wastewater flows through the reactor at a high upflow velocity.

Because anaerobic wastewater treatment processes are complex and highly sensitive to organic load disturbances and to dynamic changes in the concentration of substances in the influent wastewater, proper management of the biomass inventory must be ensured. Typically, biomass management is performed reactively (i.e., after a sudden change in concentration or a perturbation), which is undesirable because, among other problems, the biomass then takes longer to adapt to the influent wastewater. When the biomass inventory is low because of a failure to react in time, process upsets can occur that lead to significant downtime and loss of biomass. Such downtime may further be accompanied by economic loss and by loss of time and resources due to the inability to deliver. Thus, there is a need to move wastewater treatment processes from "reactive" management to "predictive" management.

For the reasons described above, one approach to predictive management is to use mathematical models to predict the key compositional qualities of the treated wastewater. This approach reduces impediments to efficient operation of the wastewater treatment process and facilitates biomass inventory management. However, finding an appropriate mathematical model that accurately predicts anaerobic wastewater treatment performance is a challenge, owing to the multivariate and nonlinear nature of the process, its susceptibility to random noise and to systematic errors introduced by measurement equipment, the dynamic nature of the process information, and time delays. Furthermore, current mathematical models that incorporate fundamental theory (e.g., mechanistic models) are inadequate or inefficient at modeling the dynamic complexity of anaerobic reaction systems. It is readily appreciated that the required computation time also increases exponentially as system complexity increases. This makes mechanistic models unsuitable where "near real-time" predictions or reasonable turn-around times are expected.

Another approach is to use soft sensors, in particular statistical, data-driven soft sensors built from recorded process history data, to predict an effluent parameter such as Total Organic Carbon (TOC). When such soft sensors are used to predict certain wastewater treatment processes and to monitor effluent (treated water) parameters, they need further improvement to reduce computational requirements and resources while improving prediction accuracy.

Because UASB and EGSB processes are complex and involve a very large number of parameters, soft sensors that predict effluent parameters such as TOC for these processes have not yet emerged. Furthermore, there is a need to improve the accuracy and efficiency of such soft sensor prediction techniques by providing a reliable assessment of the aforementioned effluent parameters, so that the performance of the EGSB reactor can be estimated with reasonable, if not near real-time, turn-around times.

It is therefore an object of the present invention to meet the above need and at least partly reduce the challenges.

Disclosure of Invention

The present invention includes the development of soft sensors to predict and monitor the operation of wastewater treatment operations, particularly in the field of Expanded Granular Sludge Bed (EGSB) reactors, but is not limited to such reactors in the anaerobic treatment of industrial wastewater. In one embodiment, a partial least squares based soft sensor is used to predict TOC as a performance indicator to monitor EGSB reactors in a wastewater treatment plant (WWTP).

According to an aspect of the invention, there is a method for predicting a parameter of a wastewater treatment process, comprising the steps of: (a) obtaining a data set comprising a plurality of process variables related to the parameter of the wastewater treatment process to be predicted; (b) obtaining a predetermined number of measurements of the parameter to be predicted; (c) pre-processing the obtained data set, the pre-processing step comprising aligning the process variables to account for any time-varying nature of the process variables and classifying the data set into input groups, the input groups comprising the data set and the measured values of the parameter to be predicted; (d) obtaining a composite output at each of a plurality of soft sensors; and (e) averaging the composite outputs across the plurality of soft sensors, the averaged composite output being set as the final predicted parameter.

Preferably, the wastewater treatment process comprises an Expanded Granular Sludge Bed (EGSB) reactor process and the parameter to be predicted is the Total Organic Carbon (TOC) of the effluent of the EGSB reactor.

Preferably, the pre-processing step comprises: in the event that a gap is detected in the data set, applying linear interpolation, or another interpolation such as polynomial or spline interpolation, to the data set.

Preferably, the classification of the data set comprises: the data set is expressed in an autoregressive with exogenous input (ARX) structure.

Preferably, the measured value of the parameter to be predicted is passed through a noise reduction filter prior to step (d). The noise reduction filter may preferably comprise a Savitzky-Golay filter and/or a Kalman filter.

Preferably, four soft sensors are included. In such an instance, the output of each of the four sensors is preferably further processed to account for inaccuracies arising from the dynamic nature of the wastewater treatment process prior to step (e).

Preferably, the soft sensor comprises an artificial neural network, a support vector machine, and a Gaussian process regression.

According to a second aspect of the invention, there is a system for predicting a parameter of a wastewater treatment process, comprising a first set of measurement devices arranged to obtain a data set, the data set comprising a plurality of process variables related to the parameter of the wastewater treatment process to be predicted; a second set of measurement devices arranged to obtain a predetermined number of measurements of the parameter to be predicted; and a processor arranged to: (i) receive the data set and the predetermined number of measurements from the first and second sets of measurement devices, the processor being operable to pre-process the acquired data set and classify the data set into input groups, each input group comprising the data set and the measured values of the parameter to be predicted; (ii) obtain a composite output at each of a plurality of soft sensors; and (iii) average the composite outputs across the plurality of soft sensors, the averaged composite output being set as the final predicted parameter.

Preferably, the wastewater treatment process comprises an Expanded Granular Sludge Bed (EGSB) reactor and the parameter to be predicted is an effluent parameter.

Preferably, the effluent parameters include Total Organic Carbon (TOC), Chemical Oxygen Demand (COD), and Biochemical Oxygen Demand (BOD) of the EGSB reactor.

Preferably, the EGSB reactor comprises at least one equalization tank and at least one inflow regulation tank.

Preferably, the first set of measurement devices comprises both online and offline measurements.

Preferably, the second set of measurement devices comprises off-line measurements.

Preferably, the process variables include: the level of wastewater in the at least one equalization tank; the flow rate and the inflow and outflow temperatures between the equalization tank and the inflow regulation tank; and the pH value of the wastewater at the equalization tank and the inflow regulation tank.

Drawings

For the present invention to be more readily understood and effectively practiced, reference will now be made to the accompanying drawings, which illustrate preferred embodiments of the present invention, and in which:

FIG. 1 shows an Expanded Granular Sludge Bed (EGSB) reactor process;

FIG. 2 is a flow diagram of a method for predicting parameters of a wastewater treatment process according to an embodiment of the invention;

FIG. 3 is a flow chart of an algorithm employed for developing soft sensors for use in predicting parameters of a wastewater treatment process; and

FIG. 4 shows the prediction result according to the method for predicting the parameter of the wastewater treatment process.

Other devices may be employed with the present invention and, accordingly, the drawings are not to be understood as superseding the generality of the description of the invention.

Detailed Description

In the specification, it will be appreciated that the term "effluent" refers to treated wastewater from an anaerobic/EGSB reactor. The treated wastewater may or may not reach the effluent quality, i.e., the treated wastewater from the anaerobic/EGSB reactor may require further treatment to reach sufficient quality.

According to an embodiment of the present invention, there is a method 200 for predicting a parameter of a wastewater treatment process (WWTP). In particular, the method is applicable to, but not limited to, EGSB plants/reactors, and the parameter to be predicted is the Total Organic Carbon (TOC) in the effluent of the EGSB reactor.

As shown in FIG. 1, a typical EGSB system and process 10 includes an influent conditioning tank 20 and an EGSB reactor 30. In the EGSB reactor 30, granular microorganisms are contacted with influent wastewater from the influent conditioning tank 20 at a relatively high flow rate. The wastewater is treated continuously until it is sufficiently treated, and is then discharged as treated water 40. The microbial consortium in the EGSB process degrades organic contaminants in the wastewater to form additional biomass (microbes), treated water 40, and energy in the form of biogas 50, a mixture of methane and carbon dioxide (CO₂). Various parameters of the EGSB reactor 30 (as shown in Table 1) may be used to predict the effluent parameters.

The predicted TOC may be used, inter alia, to monitor the process performance of an EGSB reactor.

Referring to fig. 2, the method 200 includes the steps of:

(a.) obtaining a data set (step 202), the data set comprising a plurality of process variables related to the TOC in the effluent of the EGSB reactor. The plurality of process variables includes, but is not limited to: TOC in the influent; pH value; flow rate; temperature, etc. In an embodiment, 22 different process variables are selected. The 22 process variables are based on an EGSB reactor having one or more equalization tanks and one or more influent conditioning tanks 20. The reactor also includes a plurality of heat exchangers operable to cool the reactor during operation.

The 22 parameters are listed in table 1 below:

TABLE 1 Process variables Table

Note: in the above table the waste water is referred to as "ww".

The process variables may be obtained from historical samples or data in the operation of the EGSB plant.

(b.) in addition to obtaining the data set of process variables, an offline (non-real time) measurement of effluent TOC corresponding to 22 process variables is obtained as initial TOC data (step 210).

(c.) the obtained data set is next pre-processed. The pre-processing comprises the following steps:

i. Alignment of process variables (step 204). The process variables in the data set may be sampled at different frequencies (seconds, minutes, hours, or days). To capture the time-varying characteristics, a common time base is predetermined for the process variables, for example 24 hours, and the hourly data of the 22 process variables over the preceding 24 hours are matched to each day's initial effluent TOC value. As a result, the dimension of the input variable is 528 (22 times 24).

ii. Missing value processing (step 206). It is to be appreciated that during wastewater treatment, the physical sensors (online 24 hours a day, 7 days a week) may not be able to obtain all of the required data, and different online sensors may fail to collect the required data over different time periods. A missing value processing step 206 is therefore added to handle such gaps in data acquisition, and the gaps may be filled by interpolation. For example, linear interpolation may be used, but other interpolation methods such as polynomial interpolation and spline interpolation, or other missing-value processing methods, may also be implemented.

iii. Classifying the input process variables and the initial output process parameter TOC (step 208). The input process variables are further arranged into an autoregressive with exogenous input (ARX) structure. This accounts for dynamic process variations such as severe random disturbances, large delays, and large time-varying characteristics, and so improves the reliability of the predictive performance.

To account for the time-varying and time-delay characteristics, the input vector x of the ARX structure is expressed mathematically in equation (1) as follows.

x = [y(t-1), ..., y(t-L), u(t)]   (1)

where the past L output values and past L control vectors are observed. The order of the ARX model structure may be optimized with the Akaike information criterion (AIC) to obtain optimized performance of the dynamic model. For the process described here, u(t) is the 528-dimensional vector of process variables at time t, and y(t-1), ..., y(t-10) are the L = 10 TOC values preceding time t.
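The three pre-processing steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the patented implementation: the function names `fill_gaps`, `align_day`, and `arx_input` are hypothetical, missing readings are assumed to be marked with `None`, gaps at the ends of a series are assumed to copy the nearest measured value, and the hourly record is assumed to start 24 hours before the first daily TOC sample.

```python
def fill_gaps(series):
    """Linearly interpolate None entries; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no measured values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


def align_day(hourly, day, hours=24):
    """Flatten the hourly rows matched to daily sample `day` into u(t).

    `hourly` holds one row per hour, each row with one value per process
    variable. Rows day*hours .. day*hours + hours - 1 are assumed to be
    the 24 hours preceding the TOC sample of that day.
    """
    start = day * hours
    window = hourly[start:start + hours]
    if len(window) != hours:
        raise ValueError("not enough hourly rows for this day")
    return [value for row in window for value in row]


def arx_input(toc_history, u_t, lags=10):
    """Build x = [y(t-1), ..., y(t-L), u(t)] as in equation (1)."""
    if len(toc_history) < lags:
        raise ValueError("need at least `lags` past TOC values")
    past = toc_history[-1:-lags - 1:-1]  # most recent value first
    return list(past) + list(u_t)
```

With 22 variables and 24 hourly rows, `align_day` returns 528 values, and `arx_input` with 10 lags extends this to the 538-dimensional input mentioned in the description.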

Signal processing is performed on the TOC measurements to remove noise from the data (step 212). Noisy offline measurements of TOC values produced from observed output data and experimental measurements obtained by the process can create difficulties in effectively acquiring trends in process dynamics. Filtering noise in the measurements of TOC is beneficial to improve the predictive performance of the developed soft sensor. In particular, filtering or smoothing out noise improves the accuracy of the dynamic model and allows for a clearer view of the inherent trends of the process.

Two filters are used to filter the noise: a Savitzky-Golay ("Sav-Gol") filter and a Kalman filter. The invention is not limited to these; other noise filtering methods such as the Fourier transform, wavelet analysis, or a Wiener filter may also be implemented.

The Sav-Gol filter is operable to smooth data and increase the signal-to-noise ratio without significantly distorting the original data. The Savitzky-Golay filter is based on local polynomial regression, which can be interpreted mathematically as follows: for each point j with value y_j, a weighted sum (linear combination) of the neighboring values is calculated, and the number of neighboring values and the degree of the polynomial control the strength of the smoothing (step 214):

\hat{y}_{j} = \frac{1}{N} \sum_{h=-k}^{k} c_{h} y_{j+h}   (2)

where \hat{y}_{j} is the smoothed, "noise-reduced" TOC value (of the smoothed curve or of a derivative), N is a normalization constant, k is the number of neighboring values on each side of j, so that the window size is 2k+1, and c_h is a coefficient (for smoothing or for the first or second derivative) that depends on the degree of the polynomial used and on the target. In the application, a window size of 3, a polynomial degree of 2, and a derivative order of 0 are used.
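Equation (2) can be illustrated with a short sketch. The settings here are an assumption that differs from the text: a quadratic fit through a 3-point window passes through all three points and returns the data unchanged, so the sketch uses a 5-point window (k = 2) with the standard quadratic smoothing coefficients c_h = (-3, 12, 17, 12, -3) and N = 35. The function name `savgol_smooth` and the choice to leave the two points at each end untouched are illustrative only.

```python
# Quadratic Savitzky-Golay smoothing coefficients for a 5-point window
# (k = 2): c_h = (-3, 12, 17, 12, -3) with normalization constant N = 35.
COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
NORM = 35.0


def savgol_smooth(values):
    """Apply equation (2) with k = 2; the two end points on each side are kept."""
    k = len(COEFFS) // 2
    smoothed = list(values)
    for j in range(k, len(values) - k):
        window = values[j - k:j + k + 1]
        smoothed[j] = sum(c * y for c, y in zip(COEFFS, window)) / NORM
    return smoothed
```

Because the coefficients come from a local quadratic fit, a noise-free quadratic trend passes through the filter unchanged, while an isolated spike is attenuated.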

Alternatively, the Kalman filter may be used. It was originally described as a recursive solution to the discrete-data linear filtering problem and is governed by a linear stochastic difference equation together with a measurement equation, as follows (step 216):

x(k+1) = Ax(k) + Bu(k) + w(k)   (3)

y(k) = Cx(k) + v(k)   (4)

where w and v are uncorrelated, normally distributed white-noise processes with covariance matrices Q and R, respectively, and S is the covariance between w and v. The Kalman filter can be divided into two distinct phases, prediction and update (filtering), as follows:

\hat{x}(k+1|k) = A\hat{x}(k|k) + Bu(k)   (5)

P(k+1|k) = AP(k|k)A^{T} + Q(k)   (6)

\hat{x}(k+1|k+1) = \hat{x}(k+1|k) + Kg(k+1)(y(k+1) - C\hat{x}(k+1|k))   (7)

P(k+1|k+1) = (I - Kg(k+1)C)P(k+1|k)   (8)

Kg(k+1) = P(k+1|k)C^{T}(CP(k+1|k)C^{T} + R(k))^{-1}   (9)

where Kg is the Kalman gain and P is the covariance matrix of the state estimation error. Prediction equations 5 and 6 project the current state and error covariance estimates forward in time to obtain the a priori estimate for the next time step. Filter equations 7 and 8 provide the feedback, i.e., they incorporate the new measured value into the a priori estimate to obtain an improved a posteriori estimate. This is a one-step-ahead filter corrected by the measurement. A simple linear Kalman filter may be used to filter the noisy process data, with A = 1, B = 0, C = I, P(0|0) = 1, and x(0|0) = 0.
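For a single TOC series, the simple filter described above reduces to a scalar recursion. The sketch below assumes A = 1, B = 0, and C = 1 with the stated initial values; the noise variances `q` and `r` are illustrative assumptions, since the text does not give values for Q and R, and the function name `kalman_smooth` is hypothetical.

```python
def kalman_smooth(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter with A = 1, B = 0, C = 1.

    q and r (process and measurement noise variances) are illustrative
    assumptions; x0 and p0 follow the initial values quoted in the text.
    """
    x, p = x0, p0
    estimates = []
    for y in measurements:
        # Prediction, equations (5) and (6): the state is carried over.
        x_pred = x
        p_pred = p + q
        # Update, equations (7) to (9).
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates
```

A larger `r` relative to `q` makes the filter trust the measurements less and smooth more heavily; in practice both values would be tuned on the plant data.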

(d.) the actual algorithms for processing the input and the measured TOC values (i.e., the soft sensors) are based on the Partial Least Squares (PLS) algorithm and the stacked PLS (SPLS) algorithm, whose governing equations are expressed mathematically in equations 10-18. The input is a vector comprising the process variables and the measured TOC values; the output comprises the initially measured effluent TOC value.

The general form of the PLS algorithm is described as follows: -

Let X = (x₁, ..., x_n)^T ∈ R^{n×m} and Y = (y₁, ..., y_n)^T ∈ R^{n×p} be the input and output matrices, which may be linearly related as follows:

Y = XB + E   (10)

where B is the matrix of regression coefficients and E is the residual matrix. Instead of finding this relationship directly, both X and Y are modeled by linear latent variables according to the PLS regression model, i.e.,

X = TP^T + E_X   (11)

and

Y = UQ^T + E_Y   (12)

with error matrices E_X and E_Y, where the matrices T and U (score matrices) and P and Q (loading matrices) each have a columns, and a ≤ min(m, p, n) is the number of PLS components. The x- and y-scores are then linked by an inner linear regression,

u₁ = b₁t₁ + h₁   (13)

where h₁ and b₁ are the residual vector and the regression coefficient, respectively, and t₁ and u₁ are the first PLS score vectors of X and Y, respectively. The second PLS component can then be generated from E_{X,1} and E_{Y,1}, shown in equations (14) and (15) below, by the same procedure that produced the first component.

E_{X,1} = X - t₁p₁^T   (14)

E_{Y,1} = Y - b₁t₁q₁^T   (15)

After this procedure is repeated a times, a PLS components are generated, where a is determined by cross-validation and here ranges from 1 to 25 (but is not limited to this range; it will be appreciated that the range used for the PLS model depends on the number of process variables in the input data). In each repetition, the score vector t_j (j = 1, ..., a) is normalized, rather than w_j and p_j, where w_j is the j-th weight vector. Thus, the PLS algorithm may be summarized in the form

\{X, Y\} \overset{PLS}{\Rightarrow} \{T, W, P, B, Q\}   (16)

where T, W, P, B, and Q are the score, weight, X-loading, regression coefficient, and Y-loading matrices, respectively. Once the regression matrix is generated, new outputs may be predicted from new inputs by equation (10). The prediction for a new sample is obtained as

\hat{Y} = X_{new} B   (17)

using the PLS model. Because the input has the ARX structure and the output is filtered by one of the two filters, the PLS soft sensors are referred to as "ARX-PLS-SG" and "ARX-PLS-K", respectively, for ease of reference in the drawings and the description. When new inputs are available, both soft sensors can generate a predicted TOC.
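The deflation procedure of equations (13) to (15) can be sketched for the single-output case (one TOC value per sample) in dependency-free Python. This is an illustrative NIPALS-style variant, not the patented implementation: it normalizes the weight vector w rather than the score vector t, it mean-centers the data, and the scalar `q` stored per component plays the role of the product b₁q₁ in equation (15). The names `pls1_fit` and `pls1_predict` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit a single-output PLS model by repeated deflation."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                 # y residual
    components = []
    for _ in range(n_components):
        w = [dot([row[j] for row in E], f) for j in range(m)]   # X^T y
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                          # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in E], t) / tt for j in range(m)]
        q = dot(f, t) / tt
        # Deflate X and y, cf. equations (14) and (15).
        E = [[row[j] - t[i] * p[j] for j in range(m)]
             for i, row in enumerate(E)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x_new):
    """Predict one sample by replaying the deflation steps on x_new."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x_new, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ev - t * pv for ev, pv in zip(e, p)]
    return y_hat
```

Replaying the deflation at prediction time avoids forming the regression matrix B explicitly; the result is equivalent to equation (17) for the centered data. In practice the number of components would be chosen by cross-validation, as described above.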

The stacked PLS (SPLS) algorithm can be used instead of PLS, or in combination with PLS, to model the EGSB process. The SPLS algorithm was developed by Ni et al. in 2009 and was previously applied in a different field, spectral calibration and prediction.

The historical data used to establish the ARX-PLS model contain an X matrix comprising 1000 samples and 538 variables, 10 of which are the measured effluent TOC values, and not all local variables carry the same information. Some are more informative, while others may be redundant. Redundant variables in the data set may impair the predictive performance of the PLS model. To mitigate this effect, the calibration set of input X is divided into n disjoint intervals X_k of equal width, and n PLS models are developed, one between the target property vector y and each of the n intervals X_k. The n PLS interval models are then weighted according to their predictive performance under cross-validation (see equation 18).

\hat{W} = \arg\min \left( y - \sum_{k=1}^{n} w_{k} \hat{y}_{k} \right)   (18)

Referring to FIG. 3, once the ARX structure is laid out, the input has 538 dimensions (including the ten measured effluent TOC values as past outputs). SPLS was developed to extract useful information from the input efficiently by weighting (see equation 18). The 538 variables are divided into n (2 ≤ n ≤ 20) contiguous regions. If n = 4, the 538 variables are divided into 4 local regions. In each local region, a PLS model is built based on equations 10-17. The 4 local PLS models are fused by weighting according to equation 18 (where n = 4). Each weight indicates the importance of the corresponding local model. Since n varies from 2 to 20, there are 19 SPLS models. The SPLS model with the best performance is selected for prediction. Like the PLS soft sensors, the SPLS soft sensors are referred to in the present invention as ARX-SPLS-SG and ARX-SPLS-K.
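The interval splitting and the weighted fusion of local models can be sketched as follows. Two points are assumptions rather than statements from the text: 538 is not divisible by every n, so the sketch lets the first intervals be one column wider, and the weights are taken proportional to the inverse squared cross-validation error of each local model, which is one plausible reading of "weighted according to predictive performance". All function names are hypothetical.

```python
def split_intervals(n_variables, n_intervals):
    """Return (start, stop) column ranges of near-equal width."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for k in range(n_intervals):
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def stacking_weights(cv_errors):
    """Weights proportional to 1 / error^2, normalized to sum to 1."""
    scores = [1.0 / (e * e) for e in cv_errors]
    total = sum(scores)
    return [s / total for s in scores]


def stacked_prediction(local_predictions, weights):
    """Weighted sum of the local PLS predictions, cf. equation (18)."""
    return sum(w * y for w, y in zip(weights, local_predictions))
```

Repeating this for n = 2 to 20 and keeping the n with the lowest cross-validation error would give the model selection described above.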

The pre-processed input is processed by each of the algorithms labeled "ARX-PLS-SG" (step 222), "ARX-SPLS-SG" (step 224), "ARX-PLS-K" (step 226), and "ARX-SPLS-K" (step 228). Each of the four algorithms produces a corresponding predicted TOC value.

Wastewater treatment systems and processes are essentially dynamic and vary over time, so a dynamic model such as PLS with ARX may model the process with unacceptable accuracy. The dynamic model must therefore be reliable and strongly adaptive. For this reason, the bias of the dynamic model can be updated by incorporating an update scheme that corrects the predicted output of the model by adding the smoothed bias, bias(k), defined in equation (19) (S. Mu et al. 2006 and F. Ahmed et al. 2009). In equation (19), the current bias, bias₀(k), defined in equation (20), and the overall bias at the previous point, bias(k-1), are weighted and summed, with the weighting factor ω varying between 0 and 1, to improve the prediction accuracy of the dynamic model. Note that the initial value bias(0) is 0.

bias(k) = ω × bias₀(k) + (1-ω) × bias(k-1)   (19)

bias₀(k) = y_obs(k-1) - y_mod(k-1)   (20)

y_obs and y_mod are, respectively, the observed target values (from an off-line laboratory) and the predicted values from the dynamic model (i.e., PLS). The final output y_cor(k) of the dynamic model can then be corrected as follows:

y_cor(k) = y_mod(k) + bias(k)   (21)

For the prediction from each soft sensor, a bias update is applied using equations (19) to (21) with ω = 0.7 to obtain an updated prediction result. Steps 232, 234, 236, and 238 correspond to the application of the bias update to the respective soft sensors 222, 224, 226, and 228.
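Equations (19) to (21) amount to exponential smoothing of the most recent prediction error. A minimal sketch follows, assuming a laboratory value is available for every previous step and using the hypothetical function name `bias_corrected`:

```python
def bias_corrected(y_mod, y_obs, omega=0.7):
    """Apply equations (19) to (21) along a sequence of predictions.

    y_obs[k] is the laboratory value that becomes available after step k,
    so the correction at step k only uses information up to step k - 1.
    """
    bias = 0.0                       # bias(0) = 0
    corrected = [y_mod[0] + bias]
    for k in range(1, len(y_mod)):
        bias0 = y_obs[k - 1] - y_mod[k - 1]            # equation (20)
        bias = omega * bias0 + (1.0 - omega) * bias    # equation (19)
        corrected.append(y_mod[k] + bias)              # equation (21)
    return corrected
```

For a model that consistently predicts 10 while the laboratory reports 12, the corrected outputs move from 10 to 11.4 and then 11.82, approaching the observed level as the bias estimate accumulates.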

Once the four soft sensor outputs have been obtained and bias-updated to account for dynamic variations, the four predicted TOC outputs are "fused" by averaging to produce a final predicted TOC value (step 240). The average of the four predicted TOC values is set as the final predicted TOC value. It is to be appreciated that the average can be a simple average or a weighted average that gives a higher weight to a certain soft sensor.
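The fusion step is a plain or weighted mean of the bias-corrected outputs. A short sketch, with the hypothetical function name `fuse` and made-up TOC values in the usage note:

```python
def fuse(predictions, weights=None):
    """Average the soft-sensor outputs; equal weights unless given."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, predictions)) / total
```

For example, `fuse([10.0, 12.0, 11.0, 13.0])` gives the simple average 11.5 of four illustrative sensor outputs, and passing `weights` favors a sensor that is trusted more.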

The examples are described next in the context of their application to an EGSB plant to demonstrate the predicted performance.

To demonstrate the predictive performance of the soft sensors developed by the present invention, a historical data set containing 1438 samples was extracted from the EGSB plant's records from January to December 2007 and used to train the various soft sensor models. For testing, 425 samples were then extracted from the plant's records from January 2013 to August 2014 (about one and a half years) to simulate online prediction. It is appreciated that as the number of samples used for training increases, the predictor output will also improve.

To assess the feasibility and soundness of the method, a commonly used performance measure is employed, namely the relative root mean square error of prediction (RRMSEP), which compares the model's predicted values with the output values of the process. RRMSEP is expressed mathematically in equation 22 below.

RRMSEP = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right)^{2}}   (22)

where y_i and \hat{y}_i are, respectively, the output of the system at step i and the predicted output from the dynamic model, and n is the number of samples in the test data set used to simulate the online prediction. The RRMSEP in equation (22) is used to evaluate the soft sensors developed by the present invention.
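The error measure can be computed directly. The sketch below assumes that the relative errors are squared before averaging, as the "root mean square" in the name implies, and that no observed value is zero; the function name `rrmsep` is illustrative.

```python
import math


def rrmsep(observed, predicted):
    """Relative root mean square error of prediction, cf. equation (22)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(((y - y_hat) / y) ** 2
                for y, y_hat in zip(observed, predicted))
    return math.sqrt(total / len(observed))
```

Because each error is divided by the observed value, the measure is dimensionless, so soft sensors can be compared across periods with different TOC levels.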

TABLE 1. Predictive performance of the developed soft sensors

The first four methods in Table 1 are each followed by a bias update to produce the predictive performance shown in Table 1, which corresponds to the flow chart in FIG. 2. As can be deduced from Table 1, SPLS extracts useful information from the variables more efficiently than PLS, and thus gives better performance. The final step of the present invention (see the flow chart of FIG. 2) is to fuse the 4 developed soft sensors to obtain better prediction performance. The prediction curves of the present invention are depicted in the accompanying FIG. 4.

Although the above examples are described in the context of TOC prediction in an EGSB reactor process, the method may also be applied to other wastewater treatment processes, such as activated sludge treatment, membrane filtration and reverse osmosis.

In the case of activated sludge treatment, aerobic bacteria are used to degrade wastewater and contaminants present in the water. Such aerobic bacteria rely on the use of dissolved oxygen in the wastewater to convert organic contaminants into additional biomass, carbon dioxide and water. Applying the method as described in the examples, the parameter to be predicted may be the efficiency of contaminant removal based on input process parameters such as influent wastewater flow rate, concentration, dissolved oxygen level, and pH.

In the case of membrane filtration and reverse osmosis, which are physical treatment methods of water that utilize membrane filtration to remove contaminants from water, the methods described in the examples can be applied to improve the operation of membrane processes by providing predictions of membrane performance based on relevant process variables (e.g., influent quality and transmembrane pressure).

The described embodiments are advantageous in at least several respects: -

● the soft sensors developed in the present invention have the ability to handle missing values, handle large amounts of process parameters, account for large delays and large time varying characteristics, compensate for system uncertainty by bias update, filter noise in measurements, extract information more efficiently by SPLS, and achieve better prediction performance by fusing 4 soft sensors. Because the present invention employs PLS to model wastewater treatment processes, prediction of effluent quality and fault detection and isolation can be achieved simultaneously.

● the present invention provides two levels of model fusion to obtain better TOC predictions. A first level of model fusion is to fuse several PLS models into a stacked PLS soft sensor to obtain a better prediction than the original PLS model. The second level is a higher level model fusion, which can fuse models from the same method (several PLS models) or from different methods (fusing PLS and SPLS or any other regression method). For model fusion, fusing more models generally results in better predictions.

It will be understood that the above embodiments are provided by way of example of the invention only and that further modifications and improvements thereto are deemed to fall within the broad scope and ambit of the invention described herein as would be apparent to those skilled in the art. In particular-

● may use more or less process variables as inputs to the soft sensor. For example, more than 22 process variables may be used to obtain better predicted results, with more process variables being considered in cases where computing process time is not as critical.

● in these described embodiments, four soft sensors are included, and the described fusion of four soft sensors is superior to fusing only two SPLS soft sensors, for example. The number of soft sensors used for fusion is unlimited and depends on the soft sensors available, but more soft sensors are typically fused.

● It is to be appreciated that various types of soft sensors can be used for fusion. Other types of soft sensors include artificial neural networks, support vector machines, Gaussian process regression, and the like.

● Although the preceding example(s) describe the parameter to be predicted as the TOC of the effluent, it will be appreciated that other effluent parameters, such as Chemical Oxygen Demand (COD) and Biochemical Oxygen Demand (BOD), are understood to correlate with TOC. Thus, the described systems and methods can also be used to predict COD and BOD, in addition to TOC, by applying simple scaling factors. For example, the ratio of COD to TOC is 2.7.

● Although the wastewater treatment processes described in the example(s) are EGSB-type processes, it will be appreciated by those skilled in the art that the methods and systems may be used to predict parameter(s) of other anaerobic wastewater treatment processes and to improve the accuracy of such predictions. In particular, fusion of soft sensors is likely to improve predictive performance in a variety of applications.
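
The scaling mentioned above for COD can be written as a one-line conversion. The Python sketch below assumes that a TOC prediction is already available and uses the example ratio of 2.7 quoted above; the function name is hypothetical, and no corresponding factor for BOD is given in the text, so none is shown.

```python
# Example COD-to-TOC ratio quoted in the description; treated here as a
# default that a user would replace with a value suited to the plant.
COD_TO_TOC_RATIO = 2.7


def estimate_cod_from_toc(toc, ratio=COD_TO_TOC_RATIO):
    """Scale a predicted TOC value to an estimated COD value."""
    if toc < 0:
        raise ValueError("TOC must be non-negative")
    return ratio * toc


# Hypothetical fused TOC prediction, in the same units as the result.
cod_estimate = estimate_cod_from_toc(100.0)
```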

It is to be further appreciated that while the present invention covers various embodiments, it also includes combinations of the discussed embodiments. Thus, features described in one embodiment that are not mutually exclusive of features described in another embodiment may be combined to form yet another embodiment of the invention.

Claims

1. A method for predicting a parameter of a wastewater treatment process, comprising the steps of:

(a) obtaining a data set from a first set of measurement devices, the data set comprising a plurality of process variables relating to parameters of the wastewater treatment process to be predicted;

(b) obtaining a predetermined number of measurements of the parameter to be predicted from a second set of measurement devices;

(c) pre-processing the obtained data set using a processor, the pre-processing step comprising: aligning the process variables to account for any time-varying properties of the process variables; and classifying the data set into an input group, the input group comprising the data set and the measured values of the parameter to be predicted;

(d) obtaining a composite output from a plurality of soft sensors; and

(e) averaging the composite output across each of the plurality of soft sensors, and setting the averaged composite output as the final predicted parameter.

2. The method of claim 1, wherein the wastewater treatment process comprises an Expanded Granular Sludge Bed (EGSB) reactor and the parameter to be predicted is Total Organic Carbon (TOC) of effluent of the Expanded Granular Sludge Bed (EGSB) reactor.

3. The method according to claim 1 or 2, wherein, in case a blank is detected in the data set, the pre-processing step comprises applying linear interpolation, or another interpolation method such as polynomial interpolation or spline interpolation, to the data set.

4. The method of claim 3, wherein the classifying of the data set comprises expressing the data set in an autoregressive with exogenous input (ARX) structure.

5. A method according to any preceding claim, wherein the measured value of the parameter to be predicted is passed through a noise reduction filter prior to step (d).

6. The method of claim 5, wherein the noise reduction filter comprises a Savitzky-Golay filter and/or a Kalman filter.

7. The method of any preceding claim, wherein four soft sensors are included.

8. The method of claim 7, wherein the output from each of the four soft sensors is further processed to account for inaccuracies resulting from the dynamic nature of the wastewater treatment process prior to step (e).

9. The method of any of the preceding claims, wherein the soft sensors comprise an artificial neural network, a support vector machine, and/or a Gaussian process regression.

10. The method of any one of the preceding claims, wherein the plurality of soft sensors includes a Partial Least Squares (PLS) based soft sensor and a Stacked Partial Least Squares (SPLS) soft sensor.

11. The method of claim 10, wherein there are four soft sensors.

12. The method of claim 11, wherein there are two Partial Least Squares (PLS) based soft sensors and two Stacked Partial Least Squares (SPLS) based soft sensors.

13. A system for predicting a parameter of a wastewater treatment process, comprising:

a first set of measurement devices arranged to obtain a data set comprising a plurality of process variables relating to parameters of the wastewater treatment process to be predicted;

a second set of measurement devices arranged to obtain a predetermined number of measurements of the parameter to be predicted;

a processor arranged to:

(i) receive the data set and the predetermined number of measurements from the first set of measurement devices and the second set of measurement devices, pre-process the obtained data set, and classify the data set into input groups, the input groups comprising the data set and the measured values of the parameter to be predicted;

(ii) obtain a composite output at a plurality of soft sensors; and

(iii) average the composite output across each of the plurality of soft sensors, and set the averaged composite output as the final predicted parameter.

14. The system of claim 13, wherein the wastewater treatment process comprises an Expanded Granular Sludge Bed (EGSB) reactor and the parameter to be predicted is an effluent parameter.

15. The system of claim 14, wherein the effluent parameters include, but are not limited to, Total Organic Carbon (TOC), Chemical Oxygen Demand (COD), and Biochemical Oxygen Demand (BOD) of the Expanded Granular Sludge Bed (EGSB) reactor.

16. The system of claim 14 or 15, wherein the Expanded Granular Sludge Bed (EGSB) reactor comprises at least one equalization tank and at least one inflow conditioning tank.

17. The system of any of claims 14 to 16, wherein the first set of measurement devices comprises online measurements and offline measurements.

18. The system of any of claims 14 to 16, wherein the second set of measurement devices comprises offline measurements.

19. The system of claim 16, wherein the process variables comprise: a level of wastewater in the at least one equalization tank; the flow rate, inflow temperature and outflow temperature between the equalization tank and the inflow conditioning tank; and the pH value of the wastewater at the equalization tank and at the inflow conditioning tank.

20. The system of any of claims 13 to 19, wherein the plurality of soft sensors comprises a Partial Least Squares (PLS) based soft sensor and a Stacked Partial Least Squares (SPLS) soft sensor.

21. The system of claim 20, wherein there are four soft sensors.

22. The system of claim 21, wherein there are two Partial Least Squares (PLS) based soft sensors and two Stacked Partial Least Squares (SPLS) based soft sensors.
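
By way of illustration only, and without limiting the claims, the gap filling of claim 3 and the ARX arrangement of claim 4 might be sketched in Python as follows. The function names, the lag orders `n_y` and `n_u`, and the sample values are assumptions made for this sketch; the sketch uses a single exogenous variable, whereas the described data set contains many.

```python
def fill_blanks_linear(values):
    """Fill None entries by linear interpolation between the nearest
    known neighbours. Leading and trailing blanks are left as None,
    since this sketch does not extrapolate."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            step = (filled[right] - filled[left]) * (i - left) / span
            filled[i] = filled[left] + step
    return filled


def build_arx_rows(y, u, n_y, n_u):
    """Arrange data in an ARX structure: each row pairs the regressors
    [y[t-1..t-n_y], u[t-1..t-n_u]] with the target y[t]."""
    if len(y) != len(u):
        raise ValueError("y and u must have the same length")
    if n_y < 0 or n_u < 0:
        raise ValueError("lag orders must be non-negative")
    start = max(n_y, n_u)
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_y + 1)]
        past_u = [u[t - k] for k in range(1, n_u + 1)]
        rows.append((past_y + past_u, y[t]))
    return rows


# Hypothetical process variable with two blanks, and hypothetical
# measurements of the parameter to be predicted.
flow = fill_blanks_linear([10.0, None, None, 16.0, 18.0])
toc = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = build_arx_rows(toc, flow, n_y=2, n_u=1)
```

Here the two blanks are filled with 12.0 and 14.0, and the first ARX row pairs the regressors `[2.0, 1.0, 12.0]` with the target `3.0`. Rows of this form could then be passed to any of the soft sensors as training or prediction inputs.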