CN111027611A - Fuzzy PLS modeling method based on dynamic Bayesian network - Google Patents
Fuzzy PLS modeling method based on dynamic Bayesian network
- Publication number
- CN111027611A
- Authority
- CN
- China
- Prior art keywords
- model
- data
- fuzzy
- bayesian network
- dynamic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The invention discloses a fuzzy PLS modeling method based on a dynamic Bayesian network, which can be used to model industrial processes with strong nonlinearity, time variation, and uncertainty. First, a latent variable model is established using fuzzy partial least squares, which gives the model nonlinear modeling capability. Second, the score matrix extracted from the latent variable model is expanded into an augmented matrix so that the model can better adapt to the dynamic characteristics of the data. Finally, a Bayesian network is incorporated so that the model can better describe the uncertainty present in actual industrial processes. To verify the prediction accuracy, the method is applied to soft measurement modeling of a wastewater treatment process. The experimental results show that combining fuzzy partial least squares with a dynamic Bayesian network markedly improves prediction accuracy and is better suited to soft measurement modeling of complex industrial processes.
Description
Technical Field
The invention relates to a soft measurement method of effluent indexes in a wastewater treatment process, in particular to a fuzzy PLS modeling method based on a dynamic Bayesian network.
Background
With the continuing development of modern industry, production processes are becoming increasingly continuous and large-scale, which places higher demands on the monitoring of quality indexes in industrial processes. The strong nonlinearity, time variation, and process uncertainty present in the collected data samples pose significant challenges for conventional process monitoring. The process monitoring techniques in wide use today are online instrument detection and offline laboratory detection. Online instruments are costly and difficult to maintain; offline laboratory detection suffers from a large time lag, and the detection reagents can cause secondary pollution, so it is difficult to meet the online monitoring requirements of actual production. Establishing a soft measurement model is therefore very necessary in industrial process monitoring.
Soft measurement models in current use include multiple linear regression, principal component analysis, partial least squares, support vector machines, decision trees, and the like. However, nonlinearity and dynamic characteristics are ubiquitous in actual industrial processes, so these basic models cannot adequately describe data with complex structure. In addition, traditional methods use many variables in soft measurement modeling, which makes the model structure overly complex and increases the cost of acquiring auxiliary variables. A Bayesian network (BN), as a probability-based network structure, handles process uncertainty well, but when the data dimension is high the network structure becomes complex and the model is prone to overfitting.
To address the excessive complexity of soft measurement models, a variable selection method is usually adopted. However, the dimension of the acquired data is often far greater than the dimension actually required by the prediction model, and this obvious information redundancy makes soft measurement modeling very difficult. The problem of high data dimension can also be solved by establishing a latent variable model: selecting the latent variables that carry the most information retains most of the original information in the data while reducing its dimension. A commonly used latent variable model is partial least squares (PLS), but traditional linear PLS cannot adequately account for the nonlinear characteristics of industrial process data. Beyond nonlinearity, the time variation of industrial processes also greatly limits modeling; the common solution at present is a simple time series model. In actual industrial processes, however, the data fluctuate strongly and are non-periodic, so a simple time series method has difficulty describing the dynamic characteristics of the samples accurately.
Disclosure of Invention
In view of the above problems in the prior art, the invention provides a Dynamic-Fuzzy Partial Least Squares-Bayesian Network (D-FPLS-BN) modeling method, that is, a fuzzy PLS modeling method based on a dynamic Bayesian network.
The invention adopts a fuzzy PLS modeling method based on a dynamic Bayesian network, which comprises the following steps:
S1, data preprocessing: the input data X and the output data Y are standardized to eliminate the dimensions (units) of the data, and the data are divided into a training set and a test set. The training set is used to construct and train the model, and the test set is used to evaluate it.
S2, constructing an FPLS latent variable model to extract nonlinear features and reduce data dimensionality: traditional PLS is severely limited in handling the nonlinearity of actual industrial processes, so fuzzy rules and the fuzzy C-means (FCM) algorithm are introduced on the basis of PLS to construct an FPLS model. To prevent the model structure from becoming too complex because of high data dimension, only the latent variables carrying the most information are extracted from the FPLS latent variable model.
S3, constructing a dynamic model: a dynamic model is constructed from the latent variables extracted from the FPLS latent variable model by means of an augmented matrix, which accommodates the time variation of the process and better describes the dynamic characteristics of the process data.
S4, constructing the D-FPLS-BN model: the data expanded by the augmented matrix are used as the input of the Bayesian network to construct the network, which accounts for the uncertainty of actual industrial processes and improves the accuracy of quality-index prediction.
S5, denormalizing the data and evaluating the prediction capability of the model: the test set data are fed into the trained model for prediction, and the root mean square error (RMSE) between the predicted values and the true values is calculated to evaluate the prediction capability of the model.
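The fuzzy C-means clustering named in step S2 can be sketched as follows. This is a textbook FCM written for illustration, not the patent's own implementation; the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the stopping tolerance, and the toy data are all assumptions.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means: returns cluster centers and the membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((data.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)               # memberships of each sample sum to 1
    for _ in range(n_iter):
        Um = U ** m
        # Each center is the membership-weighted mean of the samples.
        centers = (Um.T @ data) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)                 # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Toy data: two well-separated one-dimensional groups (not process data).
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(-5.0, 0.3, (30, 1)), rng.normal(5.0, 0.3, (30, 1))])
centers, U = fuzzy_c_means(data, n_clusters=2)
```

The returned centers would play the role of the cluster centers $c_i$ used by the Gaussian membership functions in the detailed description.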
The advantage of the method is that, by combining a dynamic model and a Bayesian network on top of the FPLS latent variable model, the D-FPLS-BN soft measurement model can cope with strong nonlinearity, time variation, and uncertainty. Faced with a complex wastewater treatment process, the model therefore offers higher accuracy and generalization capability, and the soft measurement method provides higher reliability in process monitoring than a traditional sensor.
With the above scheme, the invention has the following effects compared with the prior art:
In monitoring the quality indexes of industrial processes, the fuzzy PLS modeling method based on a dynamic Bayesian network has these beneficial effects: soft measurement modeling overcomes the high cost and difficult maintenance of online instruments in industrial practice and the large time lag of offline detection; selecting latent variables in the FPLS soft measurement model avoids excessive model complexity caused by high data dimensionality and effectively extracts the nonlinear characteristics of the data; constructing the dynamic model allows the model to describe the dynamic characteristics of the data more accurately and to handle the time variation of the process; and finally, combining the model with a Bayesian network helps it describe process uncertainty, which helps ensure the high precision and generalization capability of the soft measurement model in industrial processes.
Drawings
FIG. 1 is a flow chart of a fuzzy PLS soft measurement modeling method based on a dynamic Bayesian network;
FIG. 2 is a first latent variable score vector scattergram of the PLS model versus actual wastewater treatment process data;
FIG. 3 is a first latent variable score vector scattergram for actual wastewater treatment process data when the FPLS model takes different numbers of fuzzy rules;
FIG. 4 is a graph of the RMSE results of model predictions for FPLS-BN and D-FPLS-BN under different fuzzy rules.
Detailed Description
The present invention is described more clearly and fully below in connection with what is presently considered its most practical and preferred embodiment.
The technical scheme adopted by the invention for predicting the effluent index of wastewater treatment is as follows:
S1, data preprocessing: the input data X and the output data Y are standardized according to formula (1) and divided into a training set and a test set; the training set is used to construct the model, and the test set is used to evaluate its performance;
S2, constructing an FPLS latent variable model: a latent variable model is constructed between the FPLS score vectors to account for the nonlinear characteristics of the data;
S3, constructing a dynamic model: the score matrix of the FPLS latent variable model is extracted, and the latent variables are selected by cumulative variance contribution rate: the number of latent variables is taken at the point after which the cumulative variance contribution rate changes only gently; the dynamic model is then constructed by means of an augmented matrix and a time lag coefficient;
S4, constructing the D-FPLS-BN model: the data expanded by the augmented matrix are used as the input of the Bayesian network, the Bayesian network is constructed, and predictions are made for new input data;
S5, denormalizing the data and evaluating the prediction capability of the model: the test set data are fed into the model for prediction, and the root mean square error (RMSE) between the predicted values and the true values is calculated to evaluate the prediction capability of the model.
In step S1, the data are standardized to have a mean of 0 and a variance of 1, and the initial values are set to $E_0 = X$, $F_0 = Y$, $h = 1$.
The normalization formula is as follows:
$$X = \frac{X^{*} - \mu}{\sigma} \qquad (1)$$
where $X^{*}$ is the raw data, $X$ is the normalized data, and $\mu$ and $\sigma$ are the mean and standard deviation, respectively, of all sample data.
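Step S1 and the later denormalization of step S5 can be sketched as below. This is a minimal illustration: the helper names are hypothetical, the data are random placeholders rather than wastewater measurements, and the sample counts 358 and 238 are borrowed from Example 1. As in the text, the mean and standard deviation are taken over all samples before the split.

```python
import numpy as np

def standardize(data):
    """Column-wise z-score: subtract the mean, divide by the standard deviation."""
    mu = data.mean(axis=0)
    sigma = data.std(axis=0)
    return (data - mu) / sigma, mu, sigma

def destandardize(scaled, mu, sigma):
    """Inverse of standardize; returns values to their original units (step S5)."""
    return scaled * sigma + mu

def split_in_order(X, Y, n_train):
    """Keep time order: the first n_train samples train the model, the rest test it."""
    return X[:n_train], Y[:n_train], X[n_train:], Y[n_train:]

# Random placeholder data only (6 inputs, 1 output, 358 samples).
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=5.0, scale=2.0, size=(358, 6))
Y_raw = rng.normal(loc=20.0, scale=4.0, size=(358, 1))

X, x_mu, x_sigma = standardize(X_raw)
Y, y_mu, y_sigma = standardize(Y_raw)
X_train, Y_train, X_test, Y_test = split_in_order(X, Y, n_train=238)
```

Keeping `y_mu` and `y_sigma` is what makes it possible to map standardized predictions back to physical units before computing the error.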
In step S2, the FPLS latent variable model is constructed as follows:
S21: The input and output data are decomposed using a partial least squares model as follows:
where t and u are the latent variables of X and Y, p and q are the corresponding loading vectors, and E and F are the corresponding residual matrices.
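The decomposition formulas of S21 are not reproduced in this text. The standard PLS outer model that matches the symbols defined above is usually written as below; the number of components $A$ and the summation form are assumptions rather than the patent's own notation.

```latex
X = \sum_{h=1}^{A} t_h p_h^{T} + E, \qquad
Y = \sum_{h=1}^{A} u_h q_h^{T} + F
```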
S22: Compute the $h$-th pair of feature vectors $t_h$, $u_h$:
$t_h = E_{h-1} w_h$ (5)
$u_h = F_{h-1} c_h$ (8)
S23: Calculate the cluster centers of the Gaussian membership functions:
where $c_i$ ($i = 1, 2, \ldots, L$) are the cluster centers.
S24: After the data are clustered into $L$ classes, a sub-model is established for each class; the input variable is defined as $x = [x_1\ x_2\ \ldots\ x_r]^T$ and the model parameters as $b_i = [b_{i0}\ b_{i1}\ \ldots\ b_{ir}]^T$.
S241: The TSK fuzzy function is defined as:
where $G_i$ is the normalized trigger (firing) strength.
S242: The normalized trigger strength $G_i$ and the Gaussian trigger strength $\tau_i$ of the $i$-th fuzzy rule are calculated respectively as:
where $i = 1, 2, \ldots, L$, $c_{ir}$ is the cluster center of the $i$-th Gaussian membership function, and $\sigma_i$ is the width of the membership function.
S243: The width $\sigma_i$ of the membership function is calculated by the nearest-neighbor method:
where $c_i$ and $c_l$ are the two nearest cluster centers, $l = 1, 2, \ldots, n$.
S244: Calculate the total output of the $L$ TSK sub-models:
S245: Minimize the objective function $J_G$:
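The formulas for S241 to S245 did not survive extraction, so the following is only a sketch of a first-order TSK model of the usual kind. Everything specific in it is an assumption: the Gaussian form $\tau_i = \exp(-\lVert x - c_i \rVert^2 / (2\sigma_i^2))$, the width taken as the distance to the nearest other center, a plain squared-error objective in place of $J_G$, and all function names. The cluster centers are fixed by hand here as a stand-in for the FCM result of S23.

```python
import numpy as np

def nearest_neighbor_widths(centers):
    """Width of each membership function = distance to the nearest other center (assumed rule)."""
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1)

def normalized_firing_strengths(x, centers, widths):
    """Gaussian firing strength tau_i per rule, normalized so each row sums to 1 (G_i)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    tau = np.exp(-sq_dist / (2.0 * widths ** 2))
    return tau / tau.sum(axis=1, keepdims=True)

def tsk_design_matrix(x, G):
    """Stack G_i * [1, x_1, ..., x_r] for every rule so the TSK output is linear in b."""
    ones_x = np.hstack([np.ones((x.shape[0], 1)), x])
    return np.hstack([G[:, [i]] * ones_x for i in range(G.shape[1])])

def fit_tsk(x, y, centers):
    """Least-squares estimate of all consequent parameters b_i (assumed squared-error objective)."""
    widths = nearest_neighbor_widths(centers)
    G = normalized_firing_strengths(x, centers, widths)
    b, *_ = np.linalg.lstsq(tsk_design_matrix(x, G), y, rcond=None)
    return b, widths

def predict_tsk(x, centers, widths, b):
    """Total output: sum over rules of G_i times the rule's linear sub-model."""
    G = normalized_firing_strengths(x, centers, widths)
    return tsk_design_matrix(x, G) @ b

# Toy inner relation: one-dimensional score t and a nonlinear target u = t**2.
t = np.linspace(-2.0, 2.0, 81).reshape(-1, 1)
u = (t ** 2).ravel()
centers = np.array([[-1.5], [0.0], [1.5]])   # stand-in for FCM cluster centers
b, widths = fit_tsk(t, u, centers)
u_hat = predict_tsk(t, centers, widths, b)
```

Because the normalized strengths sum to one, a single linear model is a special case of this structure, which is why adding rules can only improve the least-squares fit on the training data.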
S25: calculating load vectors of the input and output matrixes X and Y:
S26: Compute the residual matrices $E_h$ and $F_h$ for the $h$-th pair of feature vectors:
Let $h = h + 1$ and return to step S22; the calculation terminates once the effective information contained in the residual matrices $E_h$ and $F_h$ has been extracted.
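The outer loop of S22, S25, and S26 follows the pattern of the standard NIPALS algorithm for PLS, sketched below. The weight, loading, and residual formulas shown are the textbook linear ones and are assumptions here, since the patent's formulas are not reproduced in this text. In FPLS the inner relation between $t_h$ and $u_h$ would presumably be the TSK model rather than the linear regression used in this sketch.

```python
import numpy as np

def nipals_pls(X, Y, n_components, n_iter=500, tol=1e-10):
    """Textbook NIPALS PLS: returns scores T, U and loadings P, Q."""
    E, F = X.copy(), Y.copy()              # E_0 = X, F_0 = Y
    T, U, P, Q = [], [], [], []
    for h in range(n_components):
        u = F[:, [0]]                      # start from the first output column
        for _ in range(n_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)         # input weight w_h
            t = E @ w                      # t_h = E_{h-1} w_h
            c = F.T @ t
            c /= np.linalg.norm(c)         # output weight c_h
            u_new = F @ c                  # u_h = F_{h-1} c_h
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        p = E.T @ t / (t.T @ t)            # input loading p_h
        q = F.T @ t / (t.T @ t)            # output loading (linear inner relation)
        E = E - t @ p.T                    # residual E_h
        F = F - t @ q.T                    # residual F_h
        T.append(t); U.append(u); P.append(p); Q.append(q)
    return np.hstack(T), np.hstack(U), np.hstack(P), np.hstack(Q)

# Toy standardized data (not process data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
Y = X @ np.array([[1.0], [-2.0], [0.5], [0.0]]) + 0.1 * rng.normal(size=(50, 1))
X = (X - X.mean(axis=0)) / X.std(axis=0)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
T, U, P, Q = nipals_pls(X, Y, n_components=2)
```

Deflating `E` by `t @ p.T` is what makes successive score vectors orthogonal, so each new latent variable explains variation that the earlier ones left in the residual.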
In step S3, the score matrix of the FPLS latent variable model is extracted, and the dynamic model is implemented by constructing an augmented matrix:
S31: Extract the score matrix $T$ of the FPLS latent variable model and select the number of latent variables according to the cumulative variance contribution rate.
S32: The dynamic model is constructed as follows:
Assume that the input matrix of the original FPLS latent variable model is $X$:
The selected latent variables are expanded into an augmented matrix; with the time lag coefficient $d$ introduced, the augmented matrix $X_i$ is:
where $x(t)$ is a sample point and $d$ is the time lag coefficient.
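The augmented-matrix formula is missing from this text. One common layout, assumed here, places the current sample and its $d$ predecessors side by side, so that the row for time $t$ is $[x(t), x(t-1), \ldots, x(t-d)]$. The function name and the toy values are hypothetical.

```python
import numpy as np

def augment_with_lags(T, d):
    """Row for time t is [x(t), x(t-1), ..., x(t-d)]; the first d samples are dropped."""
    n = T.shape[0]
    if not 0 <= d < n:
        raise ValueError("d must satisfy 0 <= d < number of samples")
    return np.hstack([T[d - k : n - k] for k in range(d + 1)])

# Toy score matrix: 5 samples, 2 latent variables.
T = np.arange(10, dtype=float).reshape(5, 2)
X_aug = augment_with_lags(T, d=2)
```

With this layout the output samples have to be trimmed to `Y[d:]` so that each augmented row stays aligned with its target.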
In step S4, a D-FPLS-BN model is constructed:
S41: The dynamically expanded data $X_i$ serve as the nodes of the Bayesian network.
S42: Divide the data set into a training set and a test set, and use the training set to train the Bayesian network structure.
S43: Calculate the prior distribution $\pi(\xi)$ of the random variable $\xi$ in the training set.
S44: Calculate the conditional density $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$ of the samples $x_1, x_2, x_3, \ldots$ given $\xi$.
S45: Using the Bayesian formula, calculate the posterior probability density $P(\xi \mid x_1, x_2, x_3, \ldots, x_m)$ from the prior distribution $\pi(\xi)$ and the conditional density $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$.
S46: Make inferences about $\xi$ in the test set using the posterior probability density:
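The inference formula of S46 is not reproduced in this text. The Bayesian formula of S45 is standard and is written out below. The second expression, which takes the posterior mean as the point estimate of $\xi$, is only one common choice and is an assumption here, since the text does not state which estimate is used.

```latex
P(\xi \mid x_1, \ldots, x_m)
  = \frac{\pi(\xi)\, P(x_1, \ldots, x_m \mid \xi)}
         {\int \pi(\xi)\, P(x_1, \ldots, x_m \mid \xi)\, d\xi},
\qquad
\hat{\xi} = \int \xi \, P(\xi \mid x_1, \ldots, x_m)\, d\xi
```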
In step S5, the data are denormalized and the prediction capability of the model is evaluated.
The test set data are substituted into the model for prediction, and the root mean square error (RMSE) between the predicted values and the true values is calculated; the closer the RMSE is to 0, the more accurate the model. The RMSE calculation formula is as follows:
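The formula image is missing here. The usual definition is $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$, where $y_i$ is the true value, $\hat{y}_i$ the predicted value, and $n$ the number of test samples; the symbol names are assumptions. A minimal sketch:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy values: one prediction is off by 2, the others are exact.
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

The error should be computed on denormalized values so that it is expressed in the units of the measured effluent index.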
Example 1:
Take the wastewater treatment process of a wastewater treatment plant as an example. The wastewater treatment data used for soft measurement modeling contain six input variables and one output variable. The input variables are influent flow (Q), influent suspended solids ($SS_{in}$), influent biological oxygen demand ($BOD_{in}$), influent chemical oxygen demand ($COD_{in}$), influent total nitrogen ($TN_{in}$), and influent total phosphorus ($TP_{in}$); the output variable is the effluent suspended solids ($SS_{eff}$). The invention is further detailed in conjunction with FIG. 1:
The first step: the 358 groups of data are divided into a training set and a test set; the first 238 groups form the training set used to build the model, and the last 120 groups form the test set used to test its performance.
The second step: the PLS model is decomposed and combined with TSK fuzzy rules to establish the FPLS latent variable model. Table 1 gives the cumulative variance of the PLS model, and Table 2 gives the cumulative variance of the FPLS model under different numbers of fuzzy rules; an appropriate number of latent variables is selected according to the change in cumulative variance, and the score matrix is extracted. In addition, Table 3 shows, for the case of 4 fuzzy rules, the information-extraction capability under different numbers of latent variables, that is, the variance contribution rate and cumulative variance contribution rate of the output variable. In Tables 1-3, "This LV" denotes the variance contribution rate (%) and "Total" denotes the cumulative variance contribution rate (%). The number of latent variables is selected by cumulative variance contribution rate: in Table 1 the PLS method uses 2 latent variables; in Table 2, FPLS_1 uses 2 latent variables and FPLS_2, FPLS_3, and FPLS_4 use 3; in Table 3, the numbers of latent variables for FPLS_5 through FPLS_9 are 2, 3, 4, 5, respectively.
TABLE 1 variance contribution ratio and cumulative variance contribution ratio of PLS latent variable model
TABLE 2 variance contribution rate and cumulative variance contribution rate of FPLS latent variable model to different fuzzy rules
TABLE 3 variance contribution rate and cumulative variance contribution rate of fuzzy rules of FPLS latent variable model to different numbers of latent variables
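The selection rule described above, keeping latent variables until the cumulative variance contribution rate flattens, can be sketched as follows. The threshold `min_gain` and the contribution values are made up for illustration; they are not the values of Tables 1 to 3, which are not reproduced in this text, and the patent does not state a numeric cut-off.

```python
import numpy as np

def select_n_latent(contrib_percent, min_gain=5.0):
    """Return (k, cumulative), where k is the number of leading latent variables kept.

    Latent variables are kept until the next one would contribute less than
    min_gain percentage points (a hypothetical flattening rule).
    """
    contrib = np.asarray(contrib_percent, dtype=float)
    cumulative = np.cumsum(contrib)
    below = np.flatnonzero(contrib < min_gain)
    k = int(below[0]) if below.size else contrib.size
    return max(k, 1), cumulative

# Made-up "This LV" contribution rates in percent.
k, total = select_n_latent([55.0, 20.0, 3.0, 1.5, 0.5])
```

Here `total` corresponds to the "Total" column and `k` to the number of latent variables retained for the score matrix.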
The third step: the score matrix obtained from the latent variable model is expanded into an augmented matrix to construct the dynamic model;
The fourth step: the network is trained with the augmented score matrix as the input of the Bayesian network, and the trained D-FPLS-BN model is used to predict the test set data;
The fifth step: the predicted data are denormalized and the prediction capability of the model is evaluated. The prediction accuracy of the D-FPLS-BN model is compared with that of PLS, BN, PLS-BN, D-PLS-BN, and FPLS-BN. FIG. 2 is a scatter plot of the input and output score vectors of the first latent variable in PLS modeling. In FIG. 3, the sub-graphs of t(1) against u(1) are scatter plots and inner regression plots between the input and output score vectors of the first latent variable in FPLS modeling under different numbers of fuzzy rules; panels (a), (b), (c), and (d) correspond to 2, 3, 4, and 5 fuzzy rules, respectively. In the sub-graphs of t(1) against FiringStrength, the dotted line represents the normalized trigger strength and the solid lines represent the trigger strengths of the individual fuzzy rules. The scatter plots show that, for data with a strongly nonlinear structure, FPLS fits the nonlinearity better than PLS, which indicates that the FPLS method has stronger nonlinear modeling capability. FIG. 4 shows the prediction RMSE of the models under different numbers of fuzzy rules: the value 1 on the abscissa represents the PLS model, and 2-5 represent the FPLS models with 2, 3, 4, and 5 fuzzy rules, respectively; the ordinate is the RMSE value. The blue and red lines are the RMSE values of FPLS-BN and D-FPLS-BN, respectively, under the corresponding number of fuzzy rules. With 4 fuzzy rules, the FPLS-BN and D-FPLS-BN models show relatively good prediction performance and a strong ability to account for nonlinear data.
Table 4 lists the RMSE of the 6 models in predicting effluent SS. The RMSE values of PLS and BN are 1.01 and 2.35, respectively, and the RMSE of the best-performing model, D-FPLS-BN, is 0.72, which is 28.63% lower than that of the PLS method.
TABLE 4 prediction results of different models on test effluent SS
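The relative improvement over PLS can be recomputed from the two RMSE values quoted in the text. With the rounded figures 1.01 and 0.72 the reduction comes out at about 28.7%, so the quoted 28.63% was presumably obtained from unrounded RMSE values; that explanation is an assumption, since the contents of Table 4 are not reproduced here.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0

# RMSE of PLS and D-FPLS-BN as quoted (rounded to two decimals).
reduction = relative_reduction(1.01, 0.72)
```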
Because wastewater treatment data are nonlinear and time-varying and the industrial process is uncertain, it is difficult for a prediction model in soft measurement to achieve good predictions. The method of the invention accounts for the nonlinearity of the data through FPLS and describes the dynamic characteristics through the dynamic model; combined with the Bayesian network, the D-FPLS-BN model is better suited to soft measurement modeling of actual industrial processes.
The foregoing has described the general principles, principal features, and advantages of the invention. The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited thereto, and those skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and all such changes and substitutions are intended to be covered by the protection scope of the present invention. Therefore, the scope of the present invention should be defined by the appended claims and equivalents thereof.
Claims (6)
1. The fuzzy PLS modeling method based on the dynamic Bayesian network is characterized by comprising the following steps of:
S1, data preprocessing: standardizing input data X and output data Y to eliminate the dimensions of the data; dividing the data into a training set and a test set, using the training set for model construction and training, and using the test set for model evaluation;
S2, constructing an FPLS latent variable model: introducing Takagi-Sugeno-Kang (TSK) fuzzy rules and the fuzzy C-means (FCM) algorithm on the basis of PLS to construct the FPLS model, and extracting the latent variables carrying the most information from the FPLS latent variable model;
S3, constructing a dynamic model: constructing a dynamic model from the latent variables extracted from the FPLS latent variable model by means of an augmented matrix;
S4, constructing the Dynamic-Fuzzy Partial Least Squares-Bayesian Network (D-FPLS-BN) model: taking the data expanded by the augmented matrix as the input of the Bayesian network to construct the Bayesian network.
2. The fuzzy PLS modeling method based on dynamic Bayesian network as claimed in claim 1, wherein the data in step S1 are derived from wastewater treatment data, the input data X comprise data indicating the degree of wastewater pollution, and the output data Y is a pollutant index monitored at the wastewater outlet.
3. The fuzzy PLS modeling method based on dynamic bayesian network as claimed in claim 1, wherein said step S2 is specifically performed by:
S21: the input data X and the output data Y are decomposed using a partial least squares model as follows:
where t and u are the latent variables of X and Y respectively, p and q are the corresponding loading vectors, and E and F are the corresponding residual matrices;
S22: computing the $h$-th pair of feature vectors $t_h$, $u_h$:
$t_h = E_{h-1} w_h$ (4)
$u_h = F_{h-1} c_h$ (7)
S23: calculating the cluster centers of the Gaussian membership functions:
where $c_i$ ($i = 1, 2, \ldots, L$) are the cluster centers;
S24: after the data are clustered into $L$ classes, a sub-model is established for each class; the input variable is defined as $x = [x_1\ x_2\ \ldots\ x_r]^T$ and the model parameters as $b_i = [b_{i0}\ b_{i1}\ \ldots\ b_{ir}]^T$;
S241: the TSK fuzzy function is defined as:
where $G_i$ is the normalized trigger (firing) strength;
S242: the normalized trigger strength $G_i$ and the Gaussian trigger strength $\tau_i$ of the $i$-th fuzzy rule are calculated respectively as:
where $i = 1, 2, \ldots, L$, $c_{ir}$ is the cluster center of the $i$-th Gaussian membership function, and $\sigma_i$ is the width of the membership function;
S243: the width $\sigma_i$ of the membership function is calculated by the nearest-neighbor method:
where $c_i$ and $c_l$ are the two nearest cluster centers, $l = 1, 2, \ldots, n$;
S244: calculating the total output of the $L$ TSK sub-models:
S245: minimizing the objective function $J_G$:
4. The fuzzy PLS modeling method based on dynamic bayesian network as claimed in claim 1, wherein said step S3 is specifically performed by:
S31: extracting the score matrix $T$ of the FPLS latent variable model, and selecting the number of latent variables according to the cumulative variance contribution rate;
S32: the dynamic model is constructed as follows:
setting the input matrix of the original FPLS latent variable model as $X$:
expanding the selected latent variables into an augmented matrix; with the time lag coefficient $d$ introduced, the augmented matrix $X_i$ is:
where $x(t)$ is a sample point and $d$ is the time lag coefficient.
5. The fuzzy PLS modeling method based on dynamic bayesian network as claimed in claim 1, wherein said step S4 is specifically performed by:
S41: the dynamically expanded data $X_i$ serve as the nodes of the Bayesian network;
S42: dividing the data set into a training set and a test set, and using the training set to train the Bayesian network structure;
S43: calculating the prior distribution $\pi(\xi)$ of the random variable $\xi$ in the training set;
S44: calculating the conditional density $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$ of the samples $x_1, x_2, x_3, \ldots$ given $\xi$;
S45: using the Bayesian formula, calculating the posterior probability density $P(\xi \mid x_1, x_2, x_3, \ldots, x_m)$ from the prior distribution $\pi(\xi)$ and the conditional density $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$;
S46: making inferences about $\xi$ in the test set using the posterior probability density:
6. The fuzzy PLS modeling method based on dynamic Bayesian network according to any one of claims 1 to 5, further comprising a model prediction capability evaluation process, specifically: substituting the test set data into the trained model for prediction, and calculating the root mean square error (RMSE) between the predicted values and the true values to complete the evaluation of the prediction capability of the model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911225604.5A CN111027611A (en) | 2019-12-04 | 2019-12-04 | Fuzzy PLS modeling method based on dynamic Bayesian network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111027611A (en) | 2020-04-17 |
Family
ID=70204201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911225604.5A Pending CN111027611A (en) | 2019-12-04 | 2019-12-04 | Fuzzy PLS modeling method based on dynamic Bayesian network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111027611A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112562797A (en) * | 2020-11-30 | 2021-03-26 | Central South University | Method and system for predicting outlet ions in iron precipitation process |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108197380A (en) * | 2017-12-29 | 2018-06-22 | Nanjing Forestry University | Gaussian regression soft measurement modeling method based on partial least squares |
CN109492265A (en) * | 2018-10-18 | 2019-03-19 | Nanjing Forestry University | Dynamic nonlinear PLS soft measurement modeling method based on Gaussian process regression |
Non-Patent Citations (1)
Title |
---|
Zhang Hao et al.: "Prediction of wastewater treatment effluent indices by the dynamic fuzzy PLS method", 《化工自动化及仪表》, no. 6, pages 485-489 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112562797A (en) * | 2020-11-30 | 2021-03-26 | Central South University | Method and system for predicting outlet ions in iron precipitation process |
CN112562797B (en) * | 2020-11-30 | 2024-01-26 | Central South University | Method and system for predicting outlet ions in iron precipitation process |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||