CN111027611A - Fuzzy PLS modeling method based on dynamic Bayesian network - Google Patents

Info

Publication number
CN111027611A
Authority
CN
China
Prior art keywords
model
data
fuzzy
bayesian network
dynamic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911225604.5A
Other languages
Chinese (zh)
Inventor
刘鸿斌
张昊
张凤山
景宜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Forestry University
Original Assignee
Nanjing Forestry University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Forestry University filed Critical Nanjing Forestry University
Priority to CN201911225604.5A priority Critical patent/CN111027611A/en
Publication of CN111027611A publication Critical patent/CN111027611A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Optimization (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a fuzzy PLS modeling method based on a dynamic Bayesian network, which can be used to model industrial processes with strong nonlinearity, time-varying behavior, and uncertainty. First, a latent variable model is established using fuzzy partial least squares, which gives the model nonlinear modeling capability. Second, the score matrix extracted from the latent variable model is expanded into an augmented matrix, so that the model adapts better to the dynamic characteristics of the data. Finally, the model is combined with a Bayesian network so that it better describes the uncertainty present in real industrial processes. To verify the prediction accuracy, the method is applied to soft measurement modeling of a wastewater treatment process. The experimental results show that combining fuzzy partial least squares with a dynamic Bayesian network markedly improves prediction accuracy and is better suited to soft measurement modeling of complex industrial processes.

Description

Fuzzy PLS modeling method based on dynamic Bayesian network
Technical Field
The invention relates to a soft measurement method of effluent indexes in a wastewater treatment process, in particular to a fuzzy PLS modeling method based on a dynamic Bayesian network.
Background
With the continuous development of modern industry, production processes are becoming increasingly continuous and large-scale, which places higher demands on the monitoring of quality indexes in industrial processes. The strong nonlinearity, time variation, and process uncertainty present in the collected data samples pose significant challenges for conventional process monitoring. The process monitoring techniques in wide use today are online instrument detection and offline laboratory detection. Online instruments are costly and difficult to maintain; offline laboratory detection has a large time lag, its reagents can cause secondary pollution, and it cannot meet the online monitoring requirements of actual production. Establishing a soft measurement model is therefore necessary for industrial process monitoring.
Commonly used soft measurement models include multiple linear regression, principal component analysis, partial least squares, support vector machines, and decision trees. However, nonlinearity and dynamic behavior are ubiquitous in real industrial processes, so these basic models cannot adequately describe data with complex structure. Traditional methods also use many variables in soft measurement modeling, which makes the model structure overly complex and raises the cost of acquiring auxiliary variables. A Bayesian network (BN), as a probability-based network structure, handles process uncertainty well, but with high-dimensional data the network structure becomes complex and the model is prone to overfitting.
Variable selection is usually adopted to address the excessive complexity of soft measurement models, but the dimension of the acquired data is often far greater than the dimension actually required by the prediction model, and this information redundancy makes soft measurement modeling difficult. High data dimensionality can also be addressed by establishing a latent variable model: selecting the latent variables that carry the most information retains most of the original information while reducing the dimension. A commonly used latent variable model is partial least squares (PLS), but traditional linear PLS cannot adequately explain the nonlinear characteristics of industrial process data. Besides nonlinearity, the time-varying nature of industrial processes also limits modeling; the common remedy is a simple time series model. In a real industrial process, however, the data fluctuate strongly and are non-periodic, so a simple time series method has difficulty describing the dynamic characteristics of the samples accurately.
Disclosure of Invention
To address the above problems in the prior art, the invention provides a Dynamic-Fuzzy Partial Least Squares-Bayesian Network (D-FPLS-BN) modeling method based on a dynamic Bayesian network.
The fuzzy PLS modeling method based on a dynamic Bayesian network comprises the following steps:
S1, data preprocessing: standardize the input data X and the output data Y to remove the effect of differing units, then divide the data into a training set and a test set. The training set is used to construct and train the model, and the test set is used to evaluate it.
S2, constructing an FPLS latent variable model to extract nonlinear features and reduce data dimensionality: traditional PLS is severely limited in handling the nonlinearity of real industrial processes, so fuzzy rules and the fuzzy C-means (FCM) algorithm are introduced on the basis of PLS to construct an FPLS model. To keep the model structure from becoming too complex when the data dimension is high, the FPLS latent variable model retains only the latent variables that carry the most information.
S3, constructing a dynamic model: a dynamic model is built from the latent variables extracted by the FPLS latent variable model by means of an augmented matrix, which accounts for the time-varying behavior of the process and better describes the dynamic characteristics of the data.
S4, constructing the D-FPLS-BN model: the data expanded by the augmented matrix are used as the input of a Bayesian network, and the Bayesian network is constructed, which accounts for the uncertainty of the real industrial process and improves the accuracy with which the model predicts the quality index.
S5, denormalizing the data and evaluating the prediction capability of the model: the test set data are fed into the trained model for prediction, the root mean square error (RMSE) is calculated from the predicted and true values, and the evaluation of the model's prediction capability is completed.
The advantage of the method is that, on the basis of the FPLS latent variable model, the dynamic model and the Bayesian network are combined so that the D-FPLS-BN soft measurement model can cope with strong nonlinearity, time variation, and uncertainty. When applied to a complex wastewater treatment process, the model therefore has higher accuracy and generalization capability; compared with a traditional sensor, the soft measurement method offers higher reliability in process monitoring.
Compared with the prior art, the fuzzy PLS modeling method based on a dynamic Bayesian network has the following beneficial effects for monitoring quality indexes of industrial processes: the soft measurement modeling method overcomes the high cost and difficult maintenance of online instruments in real industry and the large time lag of offline detection; selecting latent variables in the FPLS soft measurement model avoids the excessive model complexity caused by high data dimensionality and effectively extracts the nonlinear characteristics of the data; constructing the dynamic model lets the model describe the dynamic characteristics of the data more accurately and effectively handles the time-varying behavior of the process; and finally, combining the model with a Bayesian network helps it describe process uncertainty, which helps ensure the high accuracy and generalization capability of the soft measurement model in industrial processes.
Drawings
FIG. 1 is a flow chart of a fuzzy PLS soft measurement modeling method based on a dynamic Bayesian network;
FIG. 2 is a first latent variable score vector scattergram of the PLS model versus actual wastewater treatment process data;
FIG. 3 is a first latent variable score vector scattergram for actual wastewater treatment process data when the FPLS model takes different numbers of fuzzy rules;
FIG. 4 is a graph of the RMSE results of model predictions for FPLS-BN and D-FPLS-BN under different fuzzy rules.
Detailed Description
The present invention will now be described more clearly and fully hereinafter, with the understanding that the present invention has been described in connection with what is presently considered to be the most practical and preferred embodiment of the invention.
The technical scheme adopted by the invention for predicting the effluent index of wastewater treatment is as follows:
S1, data preprocessing: the input data X and the output data Y are standardized according to formula (1) and divided into a training set and a test set; the training set is used to construct the model, and the test set is used to evaluate its performance;
S2, constructing an FPLS latent variable model: a latent variable model is constructed between the FPLS score vectors to explain the nonlinear characteristics of the data;
S3, constructing a dynamic model: the score matrix of the FPLS latent variable model is extracted, and the number of latent variables is selected by the cumulative variance contribution rate: the number is taken at the latent variable after which the cumulative variance contribution rate changes only slightly; the dynamic model is then constructed by means of an augmented matrix and a time-lag coefficient;
S4, constructing the D-FPLS-BN model: the data expanded by the augmented matrix are used as the input of the Bayesian network, the Bayesian network is constructed, and new input data are predicted;
S5, denormalizing the data and evaluating the prediction capability of the model: the test set data are fed into the model for prediction, and the root mean square error (RMSE) is calculated from the predicted and true values.
In step S1, the data are standardized to have a mean of 0 and a variance of 1, and the iteration is initialized with $E_0 = X$, $F_0 = Y$, $h = 1$.
The standardization formula is as follows:
$X = \dfrac{X^{*} - \mu}{\sigma}$  (1)
where $X^{*}$ is the raw data, $X$ is the standardized data, and $\mu$ and $\sigma$ are the mean and standard deviation, respectively, of all sample data.
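The standardization in step S1 can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names are hypothetical, the toy numbers are made up, and computing the statistics from the training rows alone (rather than from all samples, as the text states) is an assumption made here so that the test set does not influence the model.

```python
import numpy as np

def fit_standardizer(X_train):
    """Column-wise mean and sample standard deviation of the training data."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0, ddof=1)  # ddof=1 is an assumption
    return mu, sigma

def standardize(X, mu, sigma):
    """Apply formula (1) column by column: (X* - mu) / sigma."""
    return (X - mu) / sigma

# Toy data standing in for process measurements (values are made up).
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
mu, sigma = fit_standardizer(X_train)
Z_train = standardize(X_train, mu, sigma)
```

The same `mu` and `sigma` would then be reused for the test rows, so that both sets are expressed on one scale.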
In step S2, the FPLS latent variable model is constructed as follows:
S21: the input and output data are decomposed using a partial least squares model as follows:
$X = \sum_{h} t_h p_h^{T} + E, \qquad Y = \sum_{h} u_h q_h^{T} + F$  (2)
where $t_h$ and $u_h$ are the latent variables of X and Y, $p_h$ and $q_h$ are the corresponding loading vectors, and E and F are the corresponding residual matrices.
S22: compute the $h$-th pair of feature vectors $t_h$, $u_h$:
[Equation (3): image not reproduced]
[Equation (4): image not reproduced]
$t_h = E_{h-1} w_h$  (5)
[Equation (6): image not reproduced]
[Equation (7): image not reproduced]
$u_h = F_{h-1} c_h$  (8)
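Only equations (5) and (8) of step S22 are legible in this text. A common way to obtain the weight vectors $w_h$ and $c_h$ that they use is the classical NIPALS iteration, sketched below. Treating the missing equations as the standard NIPALS updates is an assumption, the function name `nipals_pair` is hypothetical, and the toy data are random.

```python
import numpy as np

def nipals_pair(E, F, max_iter=500, tol=1e-10):
    """One pair of PLS score vectors (t, u) from residual matrices E and F."""
    u = F[:, [0]].copy()                  # start from the first output column
    for _ in range(max_iter):
        w = E.T @ u / (u.T @ u)           # input weights (assumed standard form)
        w = w / np.linalg.norm(w)         # normalize to unit length
        t = E @ w                         # equation (5): t_h = E_{h-1} w_h
        c = F.T @ t / (t.T @ t)           # output weights (assumed standard form)
        c = c / np.linalg.norm(c)         # normalize to unit length
        u_new = F @ c                     # equation (8): u_h = F_{h-1} c_h
        converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
        u = u_new
        if converged:
            break
    return w, t, c, u

# Toy residual matrices (random, for illustration only).
rng = np.random.default_rng(0)
E0 = rng.standard_normal((20, 4))
F0 = E0 @ np.array([[1.0], [0.5], [0.0], [-1.0]])
w, t, c, u = nipals_pair(E0, F0)
```

With a single output column, as in the effluent example, the loop settles after the first pass because $c_h$ reduces to a scalar of unit magnitude.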
S23: calculate the cluster centers of the Gaussian membership functions:
[Equation image not reproduced]
[Equation image not reproduced]
where $c_i$ ($i = 1, 2, \ldots, L$) are the cluster centers.
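The cluster-center equations of step S23 are not legible here. Since the method names the fuzzy C-means (FCM) algorithm, the sketch below uses the textbook FCM updates (membership-weighted centers, then memberships from inverse distances). The fuzzifier `m = 2`, the random initialization, the stopping rule, and all names are assumptions, and the data are made up.

```python
import numpy as np

def fuzzy_c_means(X, L, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means: returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    U = rng.random((L, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)     # memberships of each sample sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = Um @ X / Um.sum(axis=1, keepdims=True)
        # Distance from every center (rows) to every sample (columns).
        dist = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        dist = np.maximum(dist, 1e-12)    # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    return centers, U

# Two well-separated one-dimensional groups (made-up values).
X = np.array([-0.1, 0.0, 0.1, 9.9, 10.0, 10.1]).reshape(-1, 1)
centers, U = fuzzy_c_means(X, L=2)
```

In the FPLS setting the clustered quantity would presumably be the input score vector $t_h$, which is why the toy data are one-dimensional.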
S24: after the data are clustered into L classes, a sub-model is established for each class. The input variable is defined as $x = [x_1, x_2, \ldots, x_r]^{T}$ and the model parameters as $b_i = [b_{i0}, b_{i1}, \ldots, b_{ir}]^{T}$.
S241: the TSK fuzzy function is defined as:
[Equation image not reproduced]
where $G_i$ is the normalized firing strength.
S242: the normalized firing strength $G_i$ and the Gaussian firing strength $\tau_i$ of the $i$-th fuzzy rule are calculated as follows:
[Equation image not reproduced]
[Equation image not reproduced]
where $i = 1, 2, \ldots, L$, $c_{ir}$ is the cluster center of the $i$-th Gaussian membership function, and $\sigma_i$ is the width of the membership function.
S243: the width $\sigma_i$ of the membership function is calculated by the nearest-neighbor method:
[Equation image not reproduced]
where $c_i$ and $c_l$ are the two nearest cluster centers, $l = 1, 2, \ldots, n$.
S244: calculate the total output of the L TSK sub-models:
[Equation image not reproduced]
S245: minimize the objective function $J_G$:
[Equation image not reproduced]
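Steps S24 to S245 describe a first-order TSK model whose exact formulas are not legible in this text. The sketch below shows one conventional realization under stated assumptions: the Gaussian firing strength is taken as $\exp(-\lVert x - c_i \rVert^2 / (2\sigma_i^2))$, the nearest-neighbor width is taken as the distance to the closest other center, and $J_G$ is taken as the sum of squared errors, minimized by linear least squares. All function names are hypothetical and the data are made up.

```python
import numpy as np

def nearest_neighbor_widths(centers):
    """Width of each rule = distance to its nearest other center (assumed)."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1)

def normalized_firing(X, centers, widths):
    """Gaussian firing strengths tau_i, normalized so each row sums to 1."""
    sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    tau = np.exp(-sq / (2.0 * widths ** 2))
    return tau / tau.sum(axis=1, keepdims=True)

def fit_tsk(X, y, centers, widths):
    """Least-squares estimate of the sub-model parameters b_i (rows of B)."""
    G = normalized_firing(X, centers, widths)
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])        # prepend bias term
    Phi = (G[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)
    b, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return b.reshape(centers.shape[0], -1)

def predict_tsk(X, centers, widths, B):
    """Total output: firing-strength-weighted sum of the linear sub-models."""
    G = normalized_firing(X, centers, widths)
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    return (G * (X1 @ B.T)).sum(axis=1)

# Toy check on a relation that every sub-model can represent exactly.
centers = np.array([[-1.0], [1.0]])
widths = nearest_neighbor_widths(centers)
X = np.linspace(-2.0, 2.0, 21).reshape(-1, 1)
y = 3.0 * X[:, 0] + 1.0
B = fit_tsk(X, y, centers, widths)
y_hat = predict_tsk(X, centers, widths, B)
```

Because the normalized firing strengths sum to one, a purely linear target is reproduced exactly; the benefit of several rules only appears when the relation between the scores is nonlinear.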
S25: calculate the loading vectors of the input and output matrices X and Y:
[Equation image not reproduced]
[Equation image not reproduced]
S26: compute the $h$-th residual matrices $E_h$ and $F_h$:
[Equation image not reproduced]
[Equation image not reproduced]
Set $h = h + 1$ and return to step S22; the calculation terminates once the useful information contained in the residual matrices $E_h$ and $F_h$ has been extracted.
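The loading and residual equations of steps S25 and S26 are not legible here. In classical PLS they are a projection followed by a rank-one deflation, which the sketch below illustrates. Treating the patent's equations as this standard form is an assumption, and `u_hat` is a hypothetical name for the inner-model prediction of $u_h$ from $t_h$ (in FPLS this would be the TSK output).

```python
import numpy as np

def deflate(E, F, t, u_hat):
    """Loadings p, q and deflated residuals (standard PLS form, assumed)."""
    p = E.T @ t / (t.T @ t)                      # input loading vector
    q = F.T @ u_hat / (u_hat.T @ u_hat)          # output loading vector
    E_next = E - t @ p.T                         # remove what t explains in E
    F_next = F - u_hat @ q.T                     # remove what u_hat explains in F
    return E_next, F_next, p, q

# Toy matrices (random, for illustration only).
rng = np.random.default_rng(1)
E = rng.standard_normal((15, 3))
F = rng.standard_normal((15, 1))
t = E @ np.array([[1.0], [0.0], [0.0]])          # stand-in score vector
u_hat = 0.8 * t                                  # stand-in inner-model output
E1, F1, p, q = deflate(E, F, t, u_hat)
```

After deflation the residuals are orthogonal to the extracted directions, which is what lets the next pass through step S22 look for new information only.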
In step S3, the score matrix of the FPLS latent variable model is extracted, and the dynamic model is implemented by constructing an augmented matrix:
S31: extract the score matrix T of the FPLS latent variable model and select the number of latent variables according to the cumulative variance contribution rate.
S32: the dynamic model is constructed as follows:
Let the input matrix of the original FPLS latent variable model be X:
[Equation image not reproduced]
The selected latent variables are expanded into an augmented matrix; introducing a time-lag coefficient $d$ gives the augmented matrix $X_i$:
[Equation image not reproduced]
where $x(t)$ is the sample point at time $t$ and $d$ is the time-lag coefficient.
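Step S3 can be illustrated with two small helpers. The first picks the number of latent variables at the point where the cumulative variance contribution flattens; the threshold `min_gain_pct` is a hypothetical parameter, since the text gives no numeric rule. The second builds a lagged matrix whose rows are $[x(t), x(t-1), \ldots, x(t-d)]$; this row layout is the usual one for dynamic PLS and is assumed here because the matrix itself is not legible. All values are made up.

```python
import numpy as np

def select_n_latent(cum_var_pct, min_gain_pct=5.0):
    """Keep latent variables until the next one adds less than min_gain_pct."""
    gains = np.diff(np.asarray(cum_var_pct, dtype=float), prepend=0.0)
    small = np.flatnonzero(gains < min_gain_pct)
    return max(1, int(small[0])) if small.size else len(gains)

def augment_with_lags(T, d):
    """Stack each sample with its d predecessors: [x(t), x(t-1), ..., x(t-d)]."""
    n = T.shape[0]
    if not 0 <= d < n:
        raise ValueError("d must satisfy 0 <= d < number of samples")
    return np.hstack([T[d - k : n - k] for k in range(d + 1)])

# Made-up cumulative variance contributions (%) for four latent variables.
n_lv = select_n_latent([40.0, 70.0, 72.0, 73.0])

# Made-up score column with five samples, expanded with time lag d = 2.
T = np.arange(5.0).reshape(5, 1)
T_aug = augment_with_lags(T, d=2)
```

Note that the first `d` samples have no complete history and are dropped, so the output targets must be trimmed to the same rows before training.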
In step S4, the D-FPLS-BN model is constructed:
S41: the dynamically expanded data $X_i$ are used as nodes of a Bayesian network.
S42: the data set is divided into a training set and a test set, and the training set is used to train the Bayesian network structure.
S43: the prior distribution of the random variable $\xi$ in the training set is calculated as $\pi(\xi)$.
S44: the conditional density of the samples $x_1, x_2, x_3, \ldots, x_m$ given $\xi$, $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$, is calculated.
S45: using the Bayesian formula, the posterior probability density $P(\xi \mid x_1, x_2, x_3, \ldots, x_m)$ is calculated from the prior distribution $\pi(\xi)$ and the conditional density $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$.
S46: inferences about $\xi$ in the test set are made using the posterior probability density:
[Equation image not reproduced]
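The text does not state which distributions the Bayesian network nodes follow. One simple case that makes steps S43 to S46 concrete is a network in which all nodes are jointly Gaussian: the posterior of the output given the inputs is then Gaussian, and its mean is a closed-form prediction. The sketch below shows only that special case. It is an assumption for illustration, not the patent's network, and the names and data are made up.

```python
import numpy as np

def fit_joint_gaussian(X, y):
    """Mean vector and covariance matrix of the stacked variables [X, y]."""
    Z = np.column_stack([X, y])
    return Z.mean(axis=0), np.cov(Z, rowvar=False)

def predict_conditional_mean(X_new, mean, cov):
    """Posterior mean of y given x under the joint Gaussian assumption."""
    r = X_new.shape[1]
    mu_x, mu_y = mean[:r], mean[r]
    S_xx, S_xy = cov[:r, :r], cov[:r, r]
    weights = np.linalg.solve(S_xx, S_xy)        # S_xx^{-1} S_xy
    return mu_y + (X_new - mu_x) @ weights

# Toy training data with an exactly linear relation (made-up values).
rng = np.random.default_rng(2)
X_train = rng.standard_normal((50, 2))
y_train = 2.0 * X_train[:, 0] - X_train[:, 1] + 3.0
mean, cov = fit_joint_gaussian(X_train, y_train)

X_test = np.array([[0.5, -1.0], [1.0, 1.0]])
y_pred = predict_conditional_mean(X_test, mean, cov)
```

In the D-FPLS-BN setting the columns of `X_train` would be the lag-augmented score vectors rather than raw measurements.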
In step S5, the data are denormalized and the evaluation of the model's prediction capability is completed.
The test set data are fed into the model for prediction, and the root mean square error (RMSE) is calculated from the predicted and true values; the closer the RMSE is to 0, the more accurate the model. The RMSE is calculated as follows:
$\mathrm{RMSE} = \sqrt{\dfrac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2}}$
where $y_i$ is the true value, $\hat{y}_i$ is the estimated value, and $N$ is the number of samples.
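Step S5 amounts to undoing the standardization of step S1 and then computing the RMSE. A minimal sketch follows; the function names are hypothetical and the numbers are made up.

```python
import numpy as np

def denormalize(y_std, mu, sigma):
    """Invert the standardization: map standardized values to original units."""
    return np.asarray(y_std, dtype=float) * sigma + mu

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Made-up standardized predictions and the statistics used to standardize y.
y_pred = denormalize([-1.0, 0.0, 1.0], mu=2.0, sigma=1.5)
error = rmse([0.5, 2.0, 5.5], y_pred)
```

Computing the RMSE after denormalization keeps the error in the physical units of the effluent index, which makes values from different models directly comparable.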
Example 1:
Take the wastewater treatment process of a wastewater treatment plant as an example. The wastewater treatment data used for soft measurement modeling contain six input variables and one output variable. The input variables are influent flow (Q), influent suspended solids ($SS_{in}$), influent biological oxygen demand ($BOD_{in}$), influent chemical oxygen demand ($COD_{in}$), influent total nitrogen ($TN_{in}$), and influent total phosphorus ($TP_{in}$); the output variable is effluent suspended solids ($SS_{eff}$). The invention is described in further detail with reference to FIG. 1:
The first step: the 358 groups of data are divided into a training set and a test set; the first 238 groups form the training set used to build the model, and the last 120 groups form the test set used to test the model's performance.
The second step: decompose the data with the PLS model and establish the FPLS latent variable model in combination with TSK fuzzy rules. Table 1 gives the cumulative variance of the PLS model, and Table 2 gives the cumulative variance of the FPLS model under different numbers of fuzzy rules; the appropriate number of latent variables is selected from the change in cumulative variance, and the score matrix is extracted. In addition, the information extraction capability with 4 fuzzy rules is examined for different numbers of latent variables; the variance contribution rate and cumulative variance contribution rate of the output variable are shown in Table 3. In Tables 1-3, "This LV" denotes the variance contribution rate (%) and "Total" denotes the cumulative variance contribution rate (%). The number of latent variables is selected by the cumulative variance contribution rate: in Table 1 the PLS method uses 2 latent variables; in Table 2, FPLS_1 uses 2 latent variables and FPLS_2, FPLS_3, and FPLS_4 use 3; in Table 3, the numbers of latent variables for FPLS_5 through FPLS_9 are 2, 3, 4, 5, respectively.
Table 1. Variance contribution rate and cumulative variance contribution rate of the PLS latent variable model
[Table image not reproduced]
Table 2. Variance contribution rate and cumulative variance contribution rate of the FPLS latent variable model under different numbers of fuzzy rules
[Table image not reproduced]
Table 3. Variance contribution rate and cumulative variance contribution rate of the FPLS latent variable model under different numbers of latent variables
[Table image not reproduced]
The third step: expand the score matrix obtained from the latent variable model into an augmented matrix to construct the dynamic model;
The fourth step: train the network using the augmented score matrix as the input of the Bayesian network, and use the trained D-FPLS-BN model to predict the test set data;
The fifth step: denormalize the predicted data and evaluate the prediction capability of the model. The prediction accuracy of the D-FPLS-BN model is compared with that of PLS, BN, PLS-BN, D-PLS-BN, and FPLS-BN. FIG. 2 is a scatter plot of the input and output score vectors of the first latent variable in PLS modeling. In FIG. 3, the sub-graphs of t(1) versus u(1) are scatter plots and inner regression plots between the input and output score vectors of the first latent variable in FPLS modeling under different numbers of fuzzy rules; panels (a), (b), (c), and (d) correspond to 2, 3, 4, and 5 fuzzy rules, respectively. In the sub-graphs of t(1) versus FiringStrength, the dotted lines represent the normalized firing strengths and the solid lines represent the firing strengths of the individual fuzzy rules. The scatter plots show that, for data with a strongly nonlinear structure, FPLS fits better than PLS, which indicates that the FPLS method has stronger nonlinear modeling capability. FIG. 4 shows the prediction root mean square error of the models under different numbers of fuzzy rules: on the abscissa, 1 denotes the PLS model and 2-5 denote the FPLS models with 2, 3, 4, and 5 fuzzy rules, respectively; the ordinate is the RMSE value. The blue and red lines are the RMSE values of FPLS-BN and D-FPLS-BN under the corresponding number of fuzzy rules. With 4 fuzzy rules, both the FPLS-BN and D-FPLS-BN models show relatively good prediction performance and a strong ability to explain nonlinear data.
Table 4 lists the RMSE results of the 6 models for predicting effluent SS. The RMSE of PLS and BN is 1.01 and 2.35, respectively, and the RMSE of the best-performing model, D-FPLS-BN, is 0.72, which is 28.63% lower than that of the PLS method.
Table 4. Prediction results of different models for effluent SS on the test set
[Table image not reproduced]
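As a quick check on the reported improvement, the relative reduction can be recomputed from the two rounded RMSE values quoted above (1.01 for PLS and 0.72 for D-FPLS-BN). This uses only those rounded figures, so a small difference from the stated 28.63% is to be expected; the stated value was presumably obtained from unrounded RMSE values.

```latex
\frac{\mathrm{RMSE}_{\mathrm{PLS}} - \mathrm{RMSE}_{\mathrm{D\text{-}FPLS\text{-}BN}}}{\mathrm{RMSE}_{\mathrm{PLS}}} \times 100\%
  = \frac{1.01 - 0.72}{1.01} \times 100\%
  \approx 28.7\%
```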
Because wastewater treatment data are nonlinear and time-varying and the industrial process is uncertain, prediction models used in soft measurement have difficulty achieving good prediction performance. The method of the invention explains the nonlinearity of the data through FPLS and describes the dynamic characteristics through the constructed dynamic model; combined with the Bayesian network, the D-FPLS-BN model is better suited to soft measurement modeling of actual industrial processes.
The foregoing has described the general principles, principal features, and advantages of the invention. The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited thereto, and those skilled in the art can easily conceive of changes or substitutions within the technical scope of the present invention, and all such changes and substitutions are intended to be covered by the protection scope of the present invention. Therefore, the scope of the present invention should be defined by the appended claims and equivalents thereof.

Claims (6)

1. A fuzzy PLS modeling method based on a dynamic Bayesian network, characterized by comprising the following steps:
S1, data preprocessing: standardizing the input data X and the output data Y to remove the effect of differing units; dividing the data into a training set and a test set, using the training set for model construction and training, and using the test set for model evaluation;
S2, constructing an FPLS latent variable model: introducing Takagi-Sugeno-Kang (TSK) fuzzy rules and the fuzzy C-means (FCM) algorithm on the basis of PLS to construct the FPLS model, and extracting the latent variables that carry the most information in the FPLS latent variable model;
S3, constructing a dynamic model: constructing a dynamic model from the latent variables extracted by the FPLS latent variable model by means of an augmented matrix;
S4, constructing the Dynamic-Fuzzy Partial Least Squares-Bayesian Network (D-FPLS-BN) model: using the data expanded by the augmented matrix as the input of the Bayesian network to construct the Bayesian network.
2. The fuzzy PLS modeling method based on a dynamic Bayesian network as claimed in claim 1, wherein the data in step S1 are derived from wastewater treatment data, the input data X comprise data indicating the degree of wastewater pollution, and the output data Y are a pollutant index monitored at the wastewater outlet.
3. The fuzzy PLS modeling method based on dynamic bayesian network as claimed in claim 1, wherein said step S2 is specifically performed by:
S21: the input data X and the output data Y are decomposed using a partial least squares model as follows:
$X = \sum_{h} t_h p_h^{T} + E, \qquad Y = \sum_{h} u_h q_h^{T} + F$  (1)
where $t_h$ and $u_h$ are the latent variables of X and Y, respectively, $p_h$ and $q_h$ are the corresponding loading vectors, and E and F are the corresponding residual matrices;
S22: compute the $h$-th pair of feature vectors $t_h$, $u_h$:
[Equation (2): image not reproduced]
[Equation (3): image not reproduced]
$t_h = E_{h-1} w_h$  (4)
[Equation (5): image not reproduced]
[Equation (6): image not reproduced]
$u_h = F_{h-1} c_h$  (7)
S23: calculate the cluster centers of the Gaussian membership functions:
[Equation image not reproduced]
[Equation image not reproduced]
where $c_i$ ($i = 1, 2, \ldots, L$) are the cluster centers;
S24: after the data are clustered into L classes, a sub-model is established for each class; the input variable is defined as $x = [x_1, x_2, \ldots, x_r]^{T}$ and the model parameters as $b_i = [b_{i0}, b_{i1}, \ldots, b_{ir}]^{T}$;
S241: the TSK fuzzy function is defined as:
[Equation image not reproduced]
where $G_i$ is the normalized firing strength;
S242: the normalized firing strength $G_i$ and the Gaussian firing strength $\tau_i$ of the $i$-th fuzzy rule are calculated as follows:
[Equation image not reproduced]
[Equation image not reproduced]
where $i = 1, 2, \ldots, L$, $c_{ir}$ is the cluster center of the $i$-th Gaussian membership function, and $\sigma_i$ is the width of the membership function;
S243: the width $\sigma_i$ of the membership function is calculated by the nearest-neighbor method:
[Equation image not reproduced]
where $c_i$ and $c_l$ are the two nearest cluster centers, $l = 1, 2, \ldots, n$;
S244: calculate the total output of the L TSK sub-models:
[Equation image not reproduced]
S245: minimize the objective function $J_G$:
[Equation image not reproduced]
4. The fuzzy PLS modeling method based on dynamic bayesian network as claimed in claim 1, wherein said step S3 is specifically performed by:
S31: extracting the score matrix T of the FPLS latent variable model, and selecting the number of latent variables according to the cumulative variance contribution rate;
S32: the dynamic model is constructed as follows:
let the input matrix of the original FPLS latent variable model be X:
[Equation image not reproduced]
the selected latent variables are expanded into an augmented matrix; introducing a time-lag coefficient $d$ gives the augmented matrix $X_i$:
[Equation image not reproduced]
where $x(t)$ is the sample point at time $t$ and $d$ is the time-lag coefficient.
5. The fuzzy PLS modeling method based on dynamic bayesian network as claimed in claim 1, wherein said step S4 is specifically performed by:
S41: using the dynamically expanded data $X_i$ as nodes of a Bayesian network;
S42: dividing the data set into a training set and a test set, and using the training set to train the Bayesian network structure;
S43: calculating the prior distribution of the random variable $\xi$ in the training set as $\pi(\xi)$;
S44: calculating the conditional density of the samples $x_1, x_2, x_3, \ldots, x_m$ given $\xi$, $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$;
S45: using the Bayesian formula, calculating the posterior probability density $P(\xi \mid x_1, x_2, x_3, \ldots, x_m)$ from the prior distribution $\pi(\xi)$ and the conditional density $P(x_1, x_2, x_3, \ldots, x_m \mid \xi)$;
S46: making inferences about $\xi$ in the test set using the posterior probability density:
[Equation image not reproduced]
6. The fuzzy PLS modeling method based on a dynamic Bayesian network according to any one of claims 1 to 5, further comprising a model prediction capability evaluation process, specifically: feeding the test set data into the trained model for prediction, and calculating the root mean square error (RMSE) from the predicted and true values to complete the evaluation of the model's prediction capability.
CN201911225604.5A 2019-12-04 2019-12-04 Fuzzy PLS modeling method based on dynamic Bayesian network Pending CN111027611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911225604.5A CN111027611A (en) 2019-12-04 2019-12-04 Fuzzy PLS modeling method based on dynamic Bayesian network

Publications (1)

Publication Number Publication Date
CN111027611A true CN111027611A (en) 2020-04-17

Family

ID=70204201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911225604.5A Pending CN111027611A (en) 2019-12-04 2019-12-04 Fuzzy PLS modeling method based on dynamic Bayesian network

Country Status (1)

Country Link
CN (1) CN111027611A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112562797A (en) * 2020-11-30 2021-03-26 Central South University Method and system for predicting outlet ions in iron precipitation process

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108197380A (en) * 2017-12-29 2018-06-22 Nanjing Forestry University Gaussian process regression soft-sensor modeling method based on partial least squares
CN109492265A (en) * 2018-10-18 2019-03-19 Nanjing Forestry University Dynamic nonlinear PLS soft-sensor modeling method based on Gaussian process regression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
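Zhang Hao et al.: "Prediction of wastewater treatment effluent indices using the dynamic fuzzy PLS method", Control and Instruments in Chemical Industry (《化工自动化及仪表》), no. 6, pages 485-489 *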

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112562797A (en) * 2020-11-30 2021-03-26 Central South University Method and system for predicting outlet ions in iron precipitation process
CN112562797B (en) * 2020-11-30 2024-01-26 Central South University Method and system for predicting outlet ions in iron precipitation process

Similar Documents

Publication Publication Date Title
CN106339536B (en) Comprehensive water quality evaluation method based on the water pollution index method and a cloud model
CN108197380B (en) Partial least square-based Gaussian process regression wastewater effluent index prediction method
CN107025338B (en) Recursive RBF neural network-based sludge bulking fault identification method
CN109492265B (en) Wastewater effluent index prediction method based on dynamic nonlinear PLS soft measurement method
CN111160776A (en) Method for detecting abnormal working condition in sewage treatment process by utilizing block principal component analysis
CN109472088A (en) Method for predicting the production pressure dynamics of shale gas wells with adjusted production
CN110309609B (en) Building indoor air quality evaluation method based on rough set and WNN
CN110175425B (en) Prediction method of residual life of gear based on MMALSTM
Liu et al. Modeling of wastewater treatment processes using dynamic Bayesian networks based on fuzzy PLS
WO2021114320A1 (en) Wastewater treatment process fault monitoring method using oica-rnn fusion model
Mao et al. Comparative study on prediction of fuel cell performance using machine learning approaches
Ordieres-Meré et al. Comparison of models created for the prediction of the mechanical properties of galvanized steel coils
CN114897103A (en) Industrial process fault diagnosis method based on neighbor component loss optimization multi-scale convolutional neural network
CN111027611A (en) Fuzzy PLS modeling method based on dynamic Bayesian network
Yang et al. Teacher–Student Uncertainty Autoencoder for the Process-Relevant and Quality-Relevant Fault Detection in the Industrial Process
Han et al. Filter transfer learning algorithm for missing data imputation in wastewater treatment process
Abiyev Fuzzy wavelet neural network for prediction of electricity consumption
Maleki et al. A new neural network-based control scheme for fault detection and fault diagnosis in fuzzy multivariate multinomial data
CN115034140A (en) Surface water quality change trend prediction method based on key control factors
CN115206444A (en) Optimal drug dosage prediction method based on FCM-ANFIS model
CN114692729A (en) New energy station bad data identification and correction method based on deep learning
Parvizi Moghadam et al. Optimization of time‐variable‐parameter model for data‐based soft sensor of industrial debutanizer
CN114580151A (en) Water demand prediction method based on gray linear regression-Markov chain model
CN114841000B (en) Soft measurement modeling method based on modal common feature separation
CN114384870B (en) Complex industrial process running state evaluation method based on kernel locally linear embedding PLS

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination