CN112541558A - Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data - Google Patents


Info

Publication number
CN112541558A
Authority
CN
China
Prior art keywords
distribution
data
bayesian
parameter
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011576333.0A
Other languages
Chinese (zh)
Inventor
任世锦
唐娴
潘剑寒
魏明生
苏陈澄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Normal University
Original Assignee
Jiangsu Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Normal University filed Critical Jiangsu Normal University
Publication of CN112541558A publication Critical patent/CN112541558A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification

Abstract

The invention discloses a Bayesian semi-supervised robust PPLS (BSRPPLS) soft measurement method based on incomplete data, which also serves as a fault monitoring method. Unlike traditional PPLS modeling methods based on the multivariate Student's t-distribution, the noise of each data vector is modeled with an independent Student's t-distribution, and the distribution contains an adjustable robustness degree-of-freedom parameter, which improves modeling flexibility. The posterior distribution parameters are estimated by Bayesian variational inference. The model reconstructs the original data from the uncontaminated data elements, reducing the influence of contaminated elements on the reconstructed data; it thereby handles the missing data and outliers that would otherwise degrade model accuracy, has good robustness, and helps improve the monitoring performance of the industrial process and the level of understanding of process operation.

Description

Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data
Technical Field
The invention belongs to the technical field of PPLS soft measurement, and particularly relates to a Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data.
Background
With the advent of the Industry 4.0 era, modern industrial automation systems are continuously developing toward greater complexity, informatization and intelligence. Process monitoring, as a key to ensuring stable product quality and the safe, stable operation of production equipment, has become an indispensable component of modern complex industrial systems. In practice, because of external environmental changes, fluctuations in raw-material quality, and the accuracy of the measuring equipment and the complexity of the equipment itself, it is difficult to directly establish a mathematical process monitoring model. Data-based process monitoring theory and technology, which can help operators and engineers further understand the production process, have therefore received wide attention and achieved good results in practical applications [1-5]. Typical process monitoring methods mainly include Principal Component Analysis (PCA) and its modified forms, Partial Least Squares (PLS), the Gaussian Mixture Model (GMM), and other statistical learning methods [5-8]. Considering the locality of process modal data and the fact that the essential features of a process often lie in a low-dimensional subspace of the data, manifold learning has a powerful ability to describe the geometry of data, perform nonlinear dimensionality reduction, and represent the local characteristics of data. Fully combining the advantages of manifold learning in describing data structure with classical statistical analysis methods is a feasible way to improve the accuracy and interpretability of fault diagnosis. Common manifold learning methods for fault diagnosis mainly include Maximum Variance Unfolding (MVU), statistics Locality Preserving Projections (LPP), Neighborhood Preserving Embedding (NPE) and their extended forms [9-13].
In practice, process data and variables exhibit the various complexities of the system itself; in addition, start-up and shutdown of the production process, or switching between operating conditions, often introduce strong dynamic characteristics, making the data representation of an industrial process very complex. This complexity mainly manifests itself as outliers and missing values, while industrial process variables exhibit dynamics, Gaussian and non-Gaussian characteristics, and randomness. Because of this complexity, process operation and quality characteristics are often hidden in the data, posing challenges to process monitoring, pattern understanding, and the acquisition of knowledge about industrial operation. How to extract hidden data features from such complex and uncertain process data, and thereby improve the accuracy of process fault diagnosis and the interpretability of the process operating mechanism, is a key fundamental problem of process fault diagnosis. The process operating characteristics often lie in a hidden low-dimensional space, and process data show strong random uncertainty under changes of the external environment and equipment operating conditions. By exploiting the strong capabilities of linear latent variable models in dimensionality reduction and information extraction, and of probabilistic graphical theory in modeling random data, modeling methods that improve model robustness and information extraction capability have gradually become important approaches to process fault monitoring and soft measurement modeling [15-21].
Considering uncertainties such as environmental noise and the difference between the statistics of the PCA principal component space and the residual space, document [15] proposes a global-local method that automatically determines the number of factors based on a Factor Analysis (FA) model, and constructs monitoring statistics using NLLP augmented with variable variance information. The Student's t-distribution can approximate both Gaussian and non-Gaussian distributions; document [7] proposes a PICA algorithm based on the t-distribution and, on that basis, a two-stage probabilistic ICA (PICA) and PPCA to extract the Gaussian and non-Gaussian information in the data respectively, improving the comprehensiveness of the model. Document [16] introduces a Bayesian regularization factor into mixture Principal Component Regression (PCR) to automatically select the number of principal components. The Student's t-distribution belongs to the family of generalized Gaussian distributions, is better suited to modeling non-Gaussian distributions, and has become an important tool for modeling non-Gaussian processes and improving model robustness [7, 20]. For the case where both process noise and observation noise follow Student's t-distributions, document [20] provides a robust filter based on the Student's t-distribution and, by fusing information from multiple sensors, improves the navigation accuracy of an integrated navigation system and its adaptability to special conditions. Considering the problems of missing data and outliers in the process, document [7] proposes a robust probabilistic PCA (RPPCA) algorithm based on an EM optimization method, describes non-Gaussian hidden variables with independent t-distributions, and discusses a PCA modeling method for missing data.
To improve the robustness of mixture models, document [21] proposes a semi-supervised robust mixture linear regression modeling method in which the input variables obey t-distributions, and applies it successfully to multi-modal process quality prediction. Document [17] proposes a Maximum-Likelihood Mixture Factor Analysis model (MLMFA) to address noisy factors, non-Gaussian components, and multi-modality. For the problems of non-Gaussian sensor noise and missing measurement data in actual systems, document [22] provides a robust principal component analysis method for incomplete data that has been used successfully for data denoising.
The Partial Least Squares (PLS) model is a widely used industrial soft-sensing and fault monitoring technique, with great advances in both theory and practice [19,23,24]. Document [19] studies the latent variable model-PLS modeling theory and probabilistic integrated modeling methods in detail on the basis of probabilistic graphical theory, and compares several latent variable models. A detailed review of recent developments of PLS shows that PLS and its extended forms remain important tools for soft measurement and fault monitoring of industrial processes [22]. However, conventional PLS is mainly directed at comparable amounts of process (input) and quality (output) data, complete data, and Gaussian noise. Actual industrial processes are characterized by the high cost of acquiring quality data, frequent loss of measurement data, measurement noise that deviates from the Gaussian distribution, and random uncertainty, all of which seriously affect the performance of the PLS model. The invention therefore aims to make full use of the information in both labeled and unlabeled process data, eliminate the influence of missing data and outliers on PLS modeling, and accurately model non-Gaussian noise.
[1] Zeyu Yang, Zhiqiang Ge. Industrial virtual sensing for big process data based on parallelized nonlinear variational Bayesian factor regression. IEEE Transactions on Instrumentation and Measurement, 2020.
[2] Jing Yang, Guo Xie, Yanxi Yang. An improved ensemble fusion autoencoder model for fault diagnosis from imbalanced and incomplete data. Control Engineering Practice, 98 (2020) 104358.
[3] Tipping, M.E., Lawrence, N.D. Variational inference for Student's t-models: robust Bayesian interpolation and generalized component analysis. Neurocomputing, 69:123-141, 2005.
[4] Bei Wang, Zhichao Li, Zhenwen Dai, Neil Lawrence, Xuefeng Yan. Data-driven mode identification and unsupervised fault detection for nonlinear multimode processes. IEEE Transactions on Industrial Informatics, 16(6):3651-3660, 2020.
[5] Wende Tian, Yujia Ren, Yuxi Dong, Shaoguang Wang, Lingzhen Bu. Fault monitoring based on mutual information feature engineering modeling in chemical process. Chinese Journal of Chemical Engineering, 27 (2019) 2491-2497.
[6] Zhaikun, Duvenxia, Lufeng, Chongtao, Xiyuan. An improved dynamic kernel principal component analysis fault detection method. CIESC Journal, 2019, 70(2):716-.
[7] Zhujinlin. Data-driven robust monitoring of industrial processes. Doctoral dissertation, Zhejiang University, 2016.
[8] Jie Yu. A nonlinear kernel Gaussian mixture model based inferential monitoring approach for fault detection and diagnosis of chemical processes. Chemical Engineering Science, 68 (2012) 506-519.
[9] Yuan-Jui Liu, Tao Chen, Yuan Yao. Nonlinear process monitoring and fault isolation using extended maximum variance unfolding. Journal of Process Control, 24 (2014) 880-891.
[10] Fei He, Jinwu Xu. A novel process monitoring and fault detection approach based on statistics locality preserving projections. Journal of Process Control, 37:46-57, 2016.
[11] Xiaoxia Chen, Chudong Tong, Ting Lan, Lijia Luo. Dynamic process monitoring based on orthogonal dynamic inner neighborhood preserving embedding model. Chemometrics and Intelligent Laboratory Systems, 193 (2019) 103812.
[12] Bing Song, Shuai Tan, Hongbo Shi. Time-space locality preserving coordination for multimode process monitoring. Chemometrics and Intelligent Laboratory Systems, 151:190-200, 2016.
[13] Yue Li, Yijie Zeng, Yuanyuan Qing, Guang-Bin Huang. Learning local discriminative representations via extreme learning machine for machine fault diagnosis. Neurocomputing, 409:275-285, 2020.
[14] Baiting, Wangsing, Waihong. Factor analysis monitoring method based on variable probability information. CIESC Journal, 2017, 68(7):2844-2850.
[15] Zhiqiang Ge. Mixture Bayesian regularization of PCR model and soft sensing application. IEEE Transactions on Industrial Electronics, 62(7):4336-4343, 2015.
[16] Tangjun Miao, Shuhaizhen, Shixuhua, Tongdong. Dynamic monitoring method of chemical process based on latent variable autoregressive algorithm. CIESC Journal, 2019, 70(3):987-.
[17] Ge Z.Q., Song Z.H. Maximum-likelihood mixture factor analysis model and its application for process monitoring. Chemometrics & Intelligent Laboratory Systems, 2010, 102(1):53-61.
[18] Weiming Shao, Zhiqiang Ge, Le Yao, Zhihuan Song. Bayesian nonlinear Gaussian mixture regression and its application to virtual sensing for multimode industrial processes. IEEE Transactions on Automation Science and Engineering, 17(2):423-437, 2020.
[19] Zhengjunhua. Latent variable regression modeling of industrial process data and its application. Doctoral dissertation, Zhejiang University, 2017.
[20] Markuo, Wuhang. Information fusion algorithm under the Student's t filter framework. Journal of Zhejiang University (Engineering Science), 54(3):581-.
[21] Weiming Shao, Zhiqiang Ge, Zhihuan Song, et al. Semisupervised robust modeling of multimode industrial processes for quality variable prediction based on Student's t mixture model. IEEE Transactions on Industrial Informatics, 16(5):2965-2976, 2020.
[22] Jaakko Luttinen, Alexander Ilin, Juha Karhunen. Bayesian robust PCA of incomplete data. Neural Processing Letters, 36(2):189-202, 2012.
[23] Chenjiayi, Zhao Zhonggai, Liu Fei. Robust PPLS model and its application in process monitoring. CIESC Journal, 67(7):2907-2915, 2016.
[24] Luo Jia Yu, et al. Quality-related fault detection method based on local information increment and MPLS. Control and Decision, https://doi.org/10.13195/j.kzyjc.2019.1402.
Disclosure of Invention
The invention aims to solve the technical problem of providing, in view of the defects of the background art, a Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data: the noise of each data vector is modeled with an independent Student's t-distribution whose adjustable robustness degree-of-freedom parameter improves the flexibility of modeling, and the posterior distribution is estimated by Bayesian variational inference. The model not only makes full use of labeled and unlabeled data, but also reconstructs the original data from the uncontaminated data elements as far as possible, reducing the influence of contaminated elements on the reconstructed data; it handles missing data and outliers, has good robustness, improves model accuracy, and helps improve the monitoring performance of the industrial process and the level of understanding of process operation.
The invention adopts the following technical scheme for solving the technical problems:
a Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data comprises a Bayesian semi-supervised robust PPLS model of incomplete data and model parameter learning of Bayesian variational inference; the method specifically comprises the following steps;
step 1, initializing prior distribution parameters and hidden variable distribution hyperparameters;
step 2, determining initial model parameters and hidden variable parameters from the training data set by using the PPLS method based on the expectation-maximization (EM) algorithm;
step 3, calculating the posterior distribution q(Δ) of the hidden variables and updating the distribution parameters;
step 4, solving an optimization problem to obtain an optimal prior hyperparameter v;
step 5, calculating the variational lower bound of the log-likelihood function according to the approximate posterior distribution;
step 6, judging whether a convergence condition is met, if so, predicting quality data corresponding to the unknown data sample to realize quality soft measurement; otherwise, returning to the step 3.
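The iteration defined by steps 1 to 6 is a standard coordinate-ascent variational loop. As a minimal runnable sketch of that control flow (the two functions below are placeholder stand-ins, not the patent's actual variational updates of q(T), q(μ,W|τ), q(U), q(α), q(β), q(τ)):

```python
import numpy as np

# Skeleton of the iteration in steps 1-6. The two functions below are
# placeholder stand-ins (assumptions for illustration), not the patent's
# actual variational updates.
def update_posteriors(state):
    # steps 3-4 stand-in: move the parameter halfway toward its target
    state["theta"] = 0.5 * (state["theta"] + state["target"])
    return state

def lower_bound(state):
    # step 5 stand-in: any quantity that increases as the fit improves
    return -(state["theta"] - state["target"]) ** 2

state = {"theta": 0.0, "target": 1.0}   # steps 1-2: initialisation
thr, L_old = 1e-5, -np.inf              # convergence threshold from step 6
for it in range(1000):
    state = update_posteriors(state)    # steps 3-4
    L = lower_bound(state)              # step 5
    if abs(L - L_old) < thr:            # step 6: |L(t+1) - L(t)| < thr
        break
    L_old = L
print(it, state["theta"])
```

On this toy update the loop stops after a handful of iterations once successive lower-bound values differ by less than the threshold; the real method returns to step 3 in exactly the same way.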
As a preferred scheme of the Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data, in step 6 the model convergence condition is that the change of the likelihood function L(q(Δ), Θ) is smaller than a predetermined threshold thr, namely

|L(q(Δ^(t+1)), Θ^(t+1)) - L(q(Δ^(t)), Θ^(t))| < thr

where L(q(Δ^(t)), Θ^(t)) and L(q(Δ^(t+1)), Θ^(t+1)) are the values at the t-th and (t+1)-th iterations respectively, and the threshold thr is set to 10^-5.
As an optimal scheme of the Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data, the Bayesian semi-supervised robust PPLS model of incomplete data is specifically as follows.

Given the outputs {y_n, n = 1, 2, …, N_L} and inputs {x_n, n = 1, 2, …, N_L} of the labeled samples and the unlabeled samples {x_n, n = N_L+1, …, N}, satisfying N = N_L + N_u, where the observation noise and the process noise obey independent t-distributions, N_L is the number of labeled samples, and the PLS input data and output data share the hidden variable t ∈ R^M, the probabilistic PLS model can be expressed as

x = P t + μ_x + ε_x,  y = C t + μ_y + ε_y   (1)

where P ∈ R^(D×M) and C ∈ R^(E×M) are weight matrices, t is the shared hidden variable, and μ_x and μ_y are the mean vectors of the process variables and the observation variables respectively; x and y are column vectors of dimensions D and E. The process data noise ε_x obeys an independent t-distribution with parameters v_x = [v_x,1, v_x,2, …, v_x,D] and τ_x = [τ_x,1, τ_x,2, …, τ_x,D]; the observation data noise ε_y also obeys an independent t-distribution of the same form. The t-distribution has the following form
St(x | μ, τ, v) = ∫_0^∞ N(x | μ, (u τ)^(-1)) Ga(u | v/2, v/2) du   (2)

where Γ(·) is the Gamma function, Ga(u | a', b') denotes a Gamma distribution with shape parameter a' and inverse scale parameter b', and v is the degree of freedom. As can be seen from the above equation, the t-distribution can be interpreted as an infinite mixture of Gaussian distributions. For the latent variable t, the priors of the model parameters P, C, μ and the noise level τ are similar to those of PPCA, in the form
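The Gaussian scale-mixture form of the t-distribution above can be checked numerically: drawing u from the Gamma distribution and then x from the corresponding Gaussian reproduces a Student's t sample. A short sketch (the values v = 5, τ = 2 are illustrative assumptions, not values from the patent):

```python
import numpy as np

# Hierarchical sampling of the Student's t as a Gamma mixture of Gaussians:
# u ~ Ga(v/2, rate v/2), then x | u ~ N(mu, (u * tau)^-1).
rng = np.random.default_rng(0)
v, tau, mu = 5.0, 2.0, 0.0
n = 200_000

u = rng.gamma(shape=v / 2, scale=2.0 / v, size=n)  # numpy's scale = 1/rate
x = rng.normal(mu, 1.0 / np.sqrt(u * tau))

# Marginally x follows a Student's t with v degrees of freedom;
# for v > 2 its variance is v / ((v - 2) * tau).
print(x.var(), v / ((v - 2) * tau))
```

The empirical variance of the mixture samples matches the closed-form t variance, confirming the "infinite mixture of Gaussians" interpretation.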
p(t) = N(t | 0, I_M)   (3)
Figure BDA0002864125690000064
Figure BDA0002864125690000065
Figure BDA0002864125690000066
where τ represents τ_x and τ_y, μ represents μ_x and μ_y, and β represents β_x and β_y; the parameter τ enters the priors of P, C and μ, and the prior of the noise parameter τ is a product of mutually independent Gamma distributions, i.e.

p(τ) = ∏_d Ga(τ_d | a_τ, b_τ)

The priors of the parameter α = [α_x, α_y] and of β are

Figure BDA0002864125690000067

p(β) = Ga(β | a_β, b_β)

Figure BDA0002864125690000068

In the simulations, to obtain broader distributions, the hyperparameters are set to a_τ = b_τ = a_α = b_α = a_β = b_β = 10^-5; for isotropic noise, one can set τ_m = τ.
Let z_n = [x_n; y_n], W = [P; C], and μ = [μ_x; μ_y]. If the elements of z_n are observed independently, so that the assumption of independently contaminated elements z_dn holds, an independent Student's t-distribution can be used to model each element of ε_n. The likelihood function of the labeled samples is

p(Z | Δ) = ∏_(dn∈O) St(z_dn | w_d t_n + μ_d, τ_d, v_d)   (7)

where O represents the set of observable indices dn of z_dn, w_d is the d-th row vector of the matrix W, d = 1, 2, …, D+E, and n = 1, 2, …, N_L. For the unlabeled samples X_u the corresponding likelihood function is

p(X_u | Δ) = ∏_(d'n'∈O') St(x_d'n' | w_d' t_n' + μ_d', τ_d', v_d')   (8)

where O' represents the observable index set d'n' of the unlabeled samples X_u, w_d' is equal to p_d', d' = 1, 2, …, D, n' = N_L+1, N_L+2, …, N, and μ_1:D, the subvector of μ from element 1 to element D, equals μ_x.

Introducing the hidden variables U and U' and constructing the Student's t-distribution hierarchically from Gaussian distributions, the likelihood function of all labeled and unlabeled samples is

p(Z, X_u, U, U' | Δ) = ∏_(dn∈O) N(z_dn | w_d t_n + μ_d, (u_dn τ_d)^(-1)) Ga(u_dn | v_d/2, v_d/2) ∏_(d'n'∈O') N(x_d'n' | w_d' t_n' + μ_d', (u_d'n' τ_d')^(-1)) Ga(u_d'n' | v_d'/2, v_d'/2)   (9)

where W_(1:D,:), the matrix formed by rows 1 to D of W, is the matrix P.
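The element-wise likelihood above, with missing entries simply dropped from the observable index set O, can be sketched as follows (a simplified stand-in using SciPy's Student's t density; the function name and the NaN convention for missing elements are assumptions for illustration):

```python
import numpy as np
from scipy import stats

def masked_t_loglik(Z, W, T, mu, tau, v):
    """Log-likelihood with independent element-wise Student's t noise.

    Missing entries of Z are encoded as NaN and simply dropped from the
    product over the observable index set O.
    """
    M = W @ T + mu[:, None]                 # model mean of each element z_dn
    obs = ~np.isnan(Z)                      # observation indicator set O
    scale = np.broadcast_to((1.0 / np.sqrt(tau))[:, None], Z.shape)
    df = np.broadcast_to(v[:, None], Z.shape)
    return stats.t.logpdf(Z[obs], df=df[obs], loc=M[obs], scale=scale[obs]).sum()

rng = np.random.default_rng(0)
Z = rng.normal(size=(3, 4)); Z[1, 2] = np.nan      # one missing element
W = rng.normal(size=(3, 2)); T = rng.normal(size=(2, 4))
print(masked_t_loglik(Z, W, T, np.zeros(3), np.ones(3), 4.0 * np.ones(3)))
```

Because each element contributes an independent t-density term, removing an element from Z changes the log-likelihood by exactly that element's term, which is what makes ignoring missing elements well-defined.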
As an optimal scheme of the Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data, the model parameter learning by Bayesian variational inference specifically comprises:

step 2.1, Bayesian variational inference;

step 2.2, posterior distribution parameter learning by Bayesian variational inference.
As the optimal scheme of the Bayes semi-supervised robust PPLS soft measurement method based on incomplete data, the Bayes variational inference specifically comprises the following steps;
the bayesian variational reasoning principle is that given a training data set Ω and the model parameters to be optimized and hidden variables Δ ═ { T, μ, W, τ, U, α, β, Θ }, the true posterior distribution p (Δ | Ω) and the arbitrary form probability distribution q (Δ) about Δ can be decomposed into log-likelihood functions lnp (Ω)
lnp(Ω)=L(q)+KL(q||p)
Where l (q) ═ q (Δ) ln (p (Δ, Ω)/q (Δ)) d Δ, KL (q | | p) ═ q (Δ) ln (q (Δ)/p (Δ | Ω)) d Δ represents Kullback-leibler (KL) divergence. Since KL (q | | p) ≧ 0, lnp (Ω) ≧ l (q), which maximizes lnp (Ω) is equivalent to maximizing l (q), the approximation of q (Δ) to p (Δ | Ω) is achieved by optimizing q (Δ) such that KL (q | | p) is 0, which takes the form of an optimization problem of
Figure BDA0002864125690000073
Suppose q (Δ) can be decomposed into products of respective optimized parameter distributions
Figure BDA0002864125690000074
Obtaining an optimal approximate distribution q byii) I.e. by
Figure BDA0002864125690000081
Wherein,-ΔiRemoving Δ for ΔiThe latter set of optimization parameters.
Based on Bayesian variational inference theory, the joint posterior probability distribution function is obtained from the model probability structure diagram and the likelihood function as

Figure BDA0002864125690000082

where, by the principle of conditional independence, p(W, μ, τ | α, β) = p(W | τ, α) p(μ | τ, β) p(τ);

According to mean-field theory, the posterior probability of the hidden variables can be approximated as

p(W, T, μ, τ, U, α, β | Z, X_u) ≈ q(T) q(μ, W | τ) q(U) q(α) q(β) q(τ)   (14)
Let Δ = {T, μ, W, τ, U, α, β, Θ} and q(Δ) = q(T) q(μ, W | τ) q(U) q(α) q(β) q(τ). By the Bayesian variational principle, the variational lower bound of the log-likelihood function is defined as

L(q(Δ), Θ) = ⟨ln p(Δ, Z, X_u | Θ)⟩_Δ - ⟨ln q(Δ)⟩_Δ + const   (15)

where const is a constant term independent of Δ. Solving for L(q(Δ), Θ) amounts to deriving each of the variational distributions separately.
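The decomposition ln p(Ω) = L(q) + KL(q||p) and the bound L(q) ≤ ln p(Ω) can be verified on a toy conjugate model in which the exact posterior is known, so that choosing q equal to it makes KL = 0 (a self-contained numerical check, not part of the claimed method):

```python
import numpy as np

# Toy conjugate model: mu ~ N(0, 1), x_i | mu ~ N(mu, 1), i = 1..n.
rng = np.random.default_rng(1)
x = rng.normal(0.5, 1.0, size=8)
n, xbar = len(x), x.mean()

# Exact log evidence: x ~ N(0, I + 11^T), det(I + 11^T) = 1 + n,
# (I + 11^T)^{-1} = I - 11^T / (1 + n).
quad = x @ x - x.sum() ** 2 / (1 + n)
ln_evidence = -0.5 * (n * np.log(2 * np.pi) + np.log(1 + n) + quad)

def elbo(m, s2):
    """L(q) for q(mu) = N(m, s2): E_q[ln p(mu, x)] plus the entropy of q."""
    e_prior = -0.5 * np.log(2 * np.pi) - 0.5 * (m ** 2 + s2)
    e_lik = -0.5 * n * np.log(2 * np.pi) - 0.5 * (((x - m) ** 2).sum() + n * s2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)
    return e_prior + e_lik + entropy

# With q equal to the exact posterior N(n*xbar/(n+1), 1/(n+1)), KL = 0,
# so L(q) equals ln p(x); any other q stays strictly below the evidence.
m_post, s2_post = n * xbar / (n + 1), 1.0 / (n + 1)
print(ln_evidence, elbo(m_post, s2_post), elbo(m_post, 2 * s2_post))
```

The first two printed values coincide, while the deliberately widened q gives a strictly smaller lower bound, which is exactly why maximizing L(q) drives q(Δ) toward the true posterior.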
As an optimal scheme of the Bayes semi-supervised robust PPLS soft measurement method based on incomplete data, the posterior distribution parameter learning of Bayes variational inference specifically comprises the following steps;
considering q (μ, W | τ, α, β), note that
Figure BDA0002864125690000083
Figure BDA0002864125690000085
Taking the derivative of the variational lower bound L(q(Δ), Θ) with respect to q(μ, W | τ) yields the following relevant terms:
Figure BDA0002864125690000084
Figure BDA0002864125690000091
where -W denotes the remaining optimization parameters of Δ excluding W, and ⟨τ_d⟩ is the expectation of the random variable τ_d; rearranging the above formula gives
Figure BDA0002864125690000092
Has a mean and a variance of
Figure BDA0002864125690000093
Figure BDA0002864125690000094
When d = D+1, D+2, …, D+E, O'_d is the empty set. By the same method one computes
Figure BDA0002864125690000095
The mean value and variance are updated in the form of
Figure BDA0002864125690000096
Figure BDA0002864125690000097
Since q (W | τ, α) and q (μ | τ, β) follow a normal distribution, then
Figure BDA0002864125690000098
According to equations (16) - (20), the corresponding mean and variance of equation (21) are expressed as follows
Figure BDA0002864125690000099
Figure BDA00028641256900000910
Obtaining
Figure BDA00028641256900000911
Then the expectations related to
Figure BDA00028641256900000912
have the following form
Figure BDA00028641256900000913
Figure BDA0002864125690000101
Figure BDA0002864125690000102
Figure BDA0002864125690000103
Figure BDA0002864125690000104
Figure BDA0002864125690000105
where
Figure BDA0002864125690000106
is the covariance matrix corresponding to
Figure BDA0002864125690000107
,
Figure BDA0002864125690000108
is the variance corresponding to μ_d, and
Figure BDA0002864125690000109
denotes the covariance vector between
Figure BDA00028641256900001010
and μ_d; all of these can be obtained directly from the covariance matrix Σ_d;
taking into account posterior distribution
Figure BDA00028641256900001011
Derived from q (tau)
Figure BDA00028641256900001012
Rearranging the above formula gives
Figure BDA00028641256900001013
Figure BDA00028641256900001014
Figure BDA00028641256900001015
Here, N_d is the sum of the number of labeled samples x_dn and the number of unlabeled samples x_dn for a given d, N_d' is the number of labeled sample outputs y_dn for a given d, O_d = {n | dn ∈ O} is the index set n of all observable labeled samples z_dn, and similarly O'_d = {n | dn ∈ O'} is the index set n of all observable unlabeled samples x_dn;
taking into account posterior distribution
Figure BDA00028641256900001016
The updated form is
Figure BDA0002864125690000111
The parametric form of q (T) can be obtained according to the above formula
Figure BDA0002864125690000112
Figure BDA0002864125690000113
Wherein, O:n={d|dn∈O},O':n={d|dn∈O'};
Note the posterior distribution
Figure BDA0002864125690000114
the derivation for q(α) gives
Figure BDA0002864125690000115
Rearranging the above formula easily yields the following parametric form of q(α)
Figure BDA0002864125690000116
Figure BDA0002864125690000117
Similarly, for the posterior distribution
Figure BDA0002864125690000118
Derived from q (beta)
Figure BDA0002864125690000119
The parameters are computed as
Figure BDA0002864125690000121
Figure BDA0002864125690000122
For the independent Student's t-distribution model, the posterior distribution
Figure BDA0002864125690000123
can be derived as
Figure BDA0002864125690000124
The distribution parameter of q (U) can be obtained from the above formula
Figure BDA0002864125690000125
Figure BDA0002864125690000126
where
Figure BDA0002864125690000127
is calculated in the form of
Figure BDA0002864125690000128
Figure BDA0002864125690000129
Is calculated in the form of
Figure BDA00028641256900001210
For a multidimensional Student's t-distribution noise model, one simply sets the parameters v_d and u_dn to v_d = v and u_dn = u_n;
The hyperparameter
Figure BDA00028641256900001211
can be obtained by maximizing the following optimization problem
Figure BDA0002864125690000131
v_d (d = D+1, D+2, …, D+E) can be obtained by maximizing the following optimization problem
Figure BDA0002864125690000132
where
Figure BDA0002864125690000133
is the Student's t-distribution in dimension d;
after modeling, for unlabeled data xnWith the corresponding hidden variable tnThen the predicted output is
Figure BDA0002864125690000134
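The prediction step can be sketched for the Gaussian limit of the model (all u_dn = 1, fully observed x), where the posterior mean of the shared hidden variable is ⟨t_n⟩ = (I + τ PᵀP)⁻¹ τ Pᵀ(x_n − μ_x). A minimal sketch with illustrative sizes and parameters (a noise-free sanity check, not the patent's full inference):

```python
import numpy as np

# Sizes, parameters and the precision tau are illustrative assumptions.
rng = np.random.default_rng(2)
D, E, M, tau = 6, 2, 3, 1e6            # high tau ~ nearly noise-free input

P = rng.normal(size=(D, M)); C = rng.normal(size=(E, M))
mu_x = rng.normal(size=D); mu_y = rng.normal(size=E)

t_true = rng.normal(size=M)
x = P @ t_true + mu_x                  # noise-free input for the sanity check

# Posterior mean of the shared hidden variable in the Gaussian limit:
# <t_n> = (I + tau * P^T P)^{-1} * tau * P^T (x - mu_x)
t_hat = np.linalg.solve(np.eye(M) + tau * P.T @ P, tau * P.T @ (x - mu_x))

y_hat = C @ t_hat + mu_y               # predicted output: C <t_n> + mu_y
print(np.abs(y_hat - (C @ t_true + mu_y)).max())
```

With high input precision the posterior mean recovers the generating hidden variable, so the predicted output matches the noise-free model output, which is the soft-measurement use of the trained model.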
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
the invention discloses a Bayesian semi-supervised robust probability PLS (BRPPLS) fault monitoring method for incomplete data, which is different from the existing multivariate student t-distribution PPLS-based modeling method, wherein independent student t-distribution is used for modeling noise of each data vector, and an adjustable robust degree of freedom parameter is included in the t-distribution, so that the flexibility of modeling is improved; solving the estimated posterior distribution by using a Bayesian variational inference method; the model not only makes full use of the marked data and the unmarked data, but also reconstructs original data by using the pollution-free data elements as much as possible, reduces the influence of the polluted elements on the reconstructed data, solves the problems of data loss and wilderness, has good robustness and high precision, and is beneficial to improving the monitoring performance of the industrial process and the understanding and cognition level of the process operation.
Drawings
FIG. 1 is a Bayesian semi-supervised robust PPLS probability structure diagram of incomplete data of the present invention;
fig. 2 is a flow chart of a soft measurement framework proposed by the present invention.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the attached drawings:
the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data comprises a Bayesian semi-supervised robust PPLS model of incomplete data and model parameter learning by Bayesian variational inference, and specifically comprises the following steps:
step 1, initializing prior distribution parameters and hidden variable distribution hyperparameters;
step 2, determining initial model parameters and hidden variable parameters from the training data set by using the PPLS (probabilistic partial least squares) method based on the EM (expectation-maximization) algorithm;
step 3, calculating the posterior distribution q(Δ) of the hidden variables and updating the distribution parameters;
step 4, solving an optimization problem and solving an optimal prior hyperparameter v;
step 5, calculating the variational lower bound of the log-likelihood function according to the approximate posterior distribution;
step 6, judging whether a convergence condition is met, if so, predicting quality data corresponding to the unknown data sample to realize quality soft measurement; otherwise, returning to the step 3.
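The iteration in steps 1-6 can be sketched as a generic alternating loop. This is a hedged illustration only: the function arguments are placeholders standing in for the concrete update formulas of equations (17)-(45), not the patent's implementation.

```python
def train_loop(init, update_q, update_v, lower_bound, thr=1e-5, max_iter=500):
    """Skeleton of steps 1-6: alternate posterior updates (step 3) and
    hyper-parameter optimization (step 4), evaluate the variational lower
    bound (step 5), and stop when it changes by less than thr (step 6)."""
    state = init()                       # steps 1-2: priors + EM-PPLS initialization
    L_old = float("-inf")
    for it in range(max_iter):
        state = update_q(state)          # step 3: update q(Delta) parameters
        state = update_v(state)          # step 4: optimal prior hyper-parameter v
        L_new = lower_bound(state)       # step 5: variational lower bound
        if abs(L_new - L_old) < thr:     # step 6: convergence check
            return state, L_new, it + 1
        L_old = L_new
    return state, L_new, max_iter

# toy demo: the "state" halves each pass and the bound L = -state rises to 0
state, L, iters = train_loop(lambda: 1.0, lambda s: s / 2, lambda s: s, lambda s: -s)
```

Any concrete model supplies its own `init`, `update_q`, `update_v` and `lower_bound`; the loop itself only encodes the control flow of the flow chart in FIG. 2.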
As shown in fig. 1, the Bayesian semi-supervised robust PPLS probabilistic structure diagram for incomplete data: in FIG. 1, yn is the output of the labeled data and xn is the input of both the labeled and unlabeled data. tn is the hidden variable shared by yn and xn; it not only eliminates redundancy and noise in the input variables, but can also express the essential link between changes in xn and changes in yn. The hidden variables α, β and τ are the probability distribution parameters of the model parameters P, C and the mean μ; they limit oscillation of the model parameters and improve the convergence rate of model training, where μ comprises the means μx and μy. Introducing the hidden variable un converts the student t-distribution into a Gaussian distribution, which facilitates solving the model optimization problem. The parameter v is the Gamma distribution parameter of the hidden variable un and improves the descriptive capability for the input and output variables. An arrow in the figure indicates that the variable at its head depends on the variable at its tail; for example, the input variable xn depends on the parameters P, tn, εx and μx, and the variable C depends on the hidden variables α and τ. Because the student t-distribution is used to model the noise, the non-Gaussian noise present in actual systems is well described and the modeling precision is improved. The model fully utilizes the information of both labeled and unlabeled samples, is suitable for situations where labeled samples are far fewer than unlabeled samples, and enlarges the application range of the method. Missing elements in the data vectors are ignored, and the normal elements of each data vector are used to reconstruct the missing data, improving the robustness and performance of the model.
The notation for the prior probability parameters is simplified in the figure; only the training data, model parameters, hidden variables and parameters to be updated are labeled. Both the complete likelihood distribution function and the form of the hidden variable posterior distribution are obtained from fig. 1.
Fig. 2 shows the implementation flow chart of the present invention. The selection of the initial prior distribution parameters and hidden variable distribution hyper-parameters is discussed above: the prior distribution parameters are determined mainly from experience, and the initial model parameters and hidden variable parameters are determined from the training data set using the EM algorithm-based PPLS method proposed in reference [19]. This improves the convergence rate of the optimization algorithm, makes it less likely to become trapped in local extrema, and improves model performance.
The posterior distribution parameters of the hidden variables are updated according to the parameter update formulas of equations (17)-(42). When updating, each parameter uses the most recently updated values of the other parameters.
Solving the optimization problems shown in equations (43) and (44) updates the student t-distribution parameter v values corresponding to the labeled and unlabeled samples; a MATLAB optimization function can be used for the solution.
The convergence condition of the model is that the change of the likelihood function L(q(Δ), Θ) shown in equation (15) is smaller than a predetermined threshold thr, that is
|L(q(Δ^(t+1)), Θ^(t+1)) - L(q(Δ^(t)), Θ^(t))| < thr
where L(q(Δ^(t)), Θ^(t)) and L(q(Δ^(t+1)), Θ^(t+1)) are the values at the t-th and (t+1)-th iterations respectively, and the threshold thr is generally set to 10^-5.
After model training is finished, the quality data corresponding to unknown data samples are predicted using equation (45), realizing soft measurement of quality.
1 Bayesian semi-supervised robust PPLS model for incomplete data
Given the output of the labeled samples
Figure BDA0002864125690000151
And input
Figure BDA0002864125690000152
and the unlabeled samples
Figure BDA0002864125690000153
satisfying N = NL + Nu, where NL is the number of labeled samples; the observation noise and the process noise obey independent t-distributions. The shared hidden variables between the PLS input data and the output data are
Figure BDA0002864125690000154
The probabilistic PLS model can then be expressed as
Figure BDA0002864125690000155
where
Figure BDA0002864125690000156
and
Figure BDA0002864125690000157
are weight matrices,
Figure BDA0002864125690000158
is the shared hidden variable; μx and μy are the mean vectors of the process variables and observation variables respectively; x and y are column vectors of dimensions D and E; the process data noise
Figure BDA0002864125690000159
vx=[vx,1, vx,2, …, vx,D], τx=[τx,1, τx,2, …, τx,D]; the observation data noise εy also obeys an independent t-distribution of the same form. The student t-distribution has the following form
Figure BDA00028641256900001510
where
Figure BDA00028641256900001511
is the Gamma function,
Figure BDA00028641256900001512
represents a Gamma distribution with shape parameter a' and inverse scale parameter b', and v' is the degree of freedom. As can be seen from the above equation, the t-distribution can be interpreted as an infinite mixture of Gaussian distributions. For the hidden variable t, the priors of the model parameters P, C, μ and the noise level τ are similar to those of PPCA, in the form
p(t)=N(t|0,IM) (3)
Figure BDA0002864125690000161
Figure BDA0002864125690000162
Figure BDA0002864125690000163
Where τ denotes τx and τy, μ denotes μx and μy, and β denotes βx and βy. The purpose of applying priors to P and C is to reduce the risk of over-fitting and to help automatically determine the dimension of the subspace. The parameter τ is used in the priors of P, C and μ; a discussion of these priors is given in reference [2]. More complex priors could of course be used for P, C and μ, but we use simple priors for the model parameters since the focus here is on the distribution of the noise. The prior of the noise parameter τ is a product of mutually independent Gamma distributions, i.e.
p(τ)=∏Ga(τd|aτ,bτ)
The priors of the parameters α = [αx, αy] and β are
Figure BDA0002864125690000164
p(β)=Ga(β|aβ,bβ)
Figure BDA0002864125690000165
In the simulation, to obtain broader distributions, the parameters are set to aτ = bτ = aα = bα = aβ = bβ = 10^-5. For isotropic noise, one can set τm = τ for each m.
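The remark below equation (2) that the student t-distribution is an infinite mixture of Gaussians can be checked numerically: drawing a precision scale u ~ Ga(v/2, v/2) and then x ~ N(0, 1/(u·τ)) yields t-distributed samples. The values v = 6 and τ = 1 below are illustrative assumptions, not part of the patent's procedure.

```python
import random
import statistics

random.seed(0)
v, tau = 6.0, 1.0                       # assumed degrees of freedom and precision
samples = []
for _ in range(200_000):
    # u ~ Gamma(shape=v/2, rate=v/2); gammavariate takes (shape, scale)
    u = random.gammavariate(v / 2, 2.0 / v)
    # x | u ~ N(0, 1/(u*tau)); marginalizing over u gives a student t
    samples.append(random.gauss(0.0, (u * tau) ** -0.5))

emp_var = statistics.pvariance(samples)
theory_var = v / ((v - 2) * tau)        # t-distribution variance, valid for v > 2
```

Up to Monte Carlo error, the empirical variance agrees with the t-distribution value v/((v-2)τ), which is what justifies the hierarchical Gaussian construction used for the likelihood in equation (9).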
Let
Figure BDA0002864125690000166
Figure BDA0002864125690000167
Figure BDA0002864125690000168
As can be seen from the above definition,
Figure BDA0002864125690000169
if the elements of zn are observed independently, so that the contaminated elements zdn are mutually independent, then an independent student t-distribution can be used to model each element of εn. The likelihood function for the labeled samples is
Figure BDA00028641256900001610
Wherein O represents the set of indices dn for which zdn is observable, wd is the d-th row vector of W, d = 1, 2, …, D+E, n = 1, 2, …, NL. For the unlabeled samples
Figure BDA0002864125690000175
the corresponding likelihood function is
Figure BDA0002864125690000171
Wherein O' represents the set of indices d'n' for which the elements of the unlabeled samples Xu are observable; wd' is equivalent to pd', d' = 1, 2, …, D, n' = NL+1, NL+2, …, N. μ1:D denotes the vector formed by the 1st through D-th elements of μ (i.e., μx).
By introducing the hidden variables U and U', the student t-distribution can be constructed hierarchically from Gaussian distributions; the likelihood function of all labeled and unlabeled samples is then
Figure BDA0002864125690000172
Wherein W1:D,: denotes the matrix consisting of the row vectors of rows 1 to D of the matrix (in fact, the matrix P). Given the observed data, Bayesian inference is based on estimating the posterior distribution of the unknown variables. We use the variational Bayesian method to deal with the intractability of the joint posterior distribution.
2 Model parameter learning by Bayesian variational inference
2.1 Bayesian variational inference
The principle of Bayesian variational inference is as follows: given a training data set Ω, the model parameters and hidden variables to be optimized Δ = {T, μ, W, τ, U, α, β, Θ}, the true posterior distribution p(Δ|Ω), and a probability distribution q(Δ) of arbitrary form over Δ, the log-likelihood function lnp(Ω) can be decomposed as
lnp(Ω)=L(q)+KL(q||p) (10)
Where L(q) = ∫q(Δ)ln(p(Δ, Ω)/q(Δ))dΔ and KL(q||p) = ∫q(Δ)ln(q(Δ)/p(Δ|Ω))dΔ is the Kullback-Leibler (KL) divergence. Since KL(q||p) ≥ 0, lnp(Ω) ≥ L(q), so maximizing lnp(Ω) is equivalent to maximizing L(q); q(Δ) is made to approximate p(Δ|Ω) by optimizing q(Δ) so that KL(q||p) approaches 0, which takes the form of the optimization problem
Figure BDA0002864125690000173
It is assumed that q(Δ) can be decomposed into a product of the individual optimized parameter distributions
Figure BDA0002864125690000174
The optimal approximate distribution qi(Δi) is obtained by
Figure BDA0002864125690000181
Here, Δ-i denotes the set of optimization parameters Δ with Δi removed.
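The decomposition in equation (10) can be verified on a toy conjugate-Gaussian model (prior θ ~ N(0,1), likelihood x|θ ~ N(θ,1)), where the evidence, the posterior, L(q) and KL(q||p) are all available in closed form. This is purely an illustration of the identity, not the patent's model:

```python
import math

# Toy model: theta ~ N(0,1), x | theta ~ N(theta,1), variational q = N(m, s2).
def elbo(m, s2, x):
    e_lik = -0.5 * math.log(2 * math.pi) - 0.5 * ((x - m) ** 2 + s2)
    e_prior = -0.5 * math.log(2 * math.pi) - 0.5 * (m ** 2 + s2)
    entropy = 0.5 * math.log(2 * math.pi * math.e * s2)
    return e_lik + e_prior + entropy       # L(q) = <ln p(x,theta)> - <ln q>

def kl_to_posterior(m, s2, x):
    mu0, v0 = x / 2, 0.5                   # exact posterior is N(x/2, 1/2)
    return 0.5 * math.log(v0 / s2) + (s2 + (m - mu0) ** 2) / (2 * v0) - 0.5

x = 1.7
log_evidence = -0.5 * math.log(2 * math.pi * 2.0) - x ** 2 / 4   # x ~ N(0, 2)
for m, s2 in [(0.0, 1.0), (0.85, 0.5), (-2.0, 3.0)]:
    gap = elbo(m, s2, x) + kl_to_posterior(m, s2, x) - log_evidence
    assert abs(gap) < 1e-9                 # ln p(x) = L(q) + KL(q||p) for any q
```

Because KL(q||p) ≥ 0 and lnp(Ω) is fixed, maximizing L(q) drives q toward the true posterior, which is exactly how equations (11)-(13) are used.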
Based on Bayesian variational inference theory, according to the model probability structure diagram shown in FIG. 1 and the likelihood function shown in equation (9), the joint posterior probability distribution function is
Figure BDA0002864125690000182
Here, by conditional independence, p(W, μ, τ|α, β) = p(W|τ, α)p(μ|τ, β)p(τ).
According to mean-field theory, the posterior probability of the hidden variables can be factorized as
p(W,T,μ,τ,U,α,β)≈q(T)q(μ,W|τ)q(U)q(α)q(β)q(τ) (14)
Let Δ = {T, μ, W, τ, U, α, β, Θ} and q(Δ) = q(T)q(μ, W|τ)q(U)q(α)q(β)q(τ); according to the Bayesian variational principle, the variational lower bound of the log-likelihood function is
L(q(Δ),Θ)=〈lnp(Δ,Z,Xu|Θ)〉Δ-〈lnq(Δ)〉Δ+const (15)
Here const is an independent term that does not depend on Δ and is treated as a constant. Solving for L(q(Δ), Θ) is equivalent to deriving each of the variational distributions separately.
2.2 Posterior distribution parameter learning by Bayesian variational inference
Considering q (μ, W | τ, α, β), note that
Figure BDA0002864125690000183
Figure BDA0002864125690000184
Taking the derivative of the variational lower bound L(q(Δ), Θ) with respect to q(μ, W|τ) yields the following related terms:
Figure BDA0002864125690000185
Figure BDA0002864125690000191
here, -W denotes the remaining optimization parameters of Δ excluding W, and 〈τd〉 is the expectation of the random variable τd. Rearranging the above formula yields
Figure BDA0002864125690000192
whose mean and variance are
Figure BDA0002864125690000193
Figure BDA0002864125690000194
It is noted that O'd is the empty set when d = D+1, D+2, …, D+E. The same method yields
Figure BDA0002864125690000195
The mean value and variance are updated in the form of
Figure BDA0002864125690000196
Figure BDA0002864125690000197
Since q (W | τ, α) and q (μ | τ, β) follow a normal distribution, then
Figure BDA0002864125690000198
According to equations (16) - (20), the corresponding mean and variance of equation (21) are expressed as follows
Figure BDA0002864125690000199
Figure BDA00028641256900001910
Obtaining
Figure BDA00028641256900001911
Then, with
Figure BDA00028641256900001912
and μd, the related expectations have the following forms
Figure BDA0002864125690000201
Figure BDA0002864125690000202
Figure BDA0002864125690000203
Figure BDA0002864125690000204
Figure BDA0002864125690000205
Figure BDA0002864125690000206
where
Figure BDA0002864125690000207
is the covariance matrix corresponding to
Figure BDA0002864125690000208
, and
Figure BDA0002864125690000209
is the variance corresponding to μd;
Figure BDA00028641256900002010
denotes the covariance vector between
Figure BDA00028641256900002011
and μd. All of these can be obtained directly from the covariance matrix Σd.
Considering the posterior distribution
Figure BDA00028641256900002012
and taking the derivative with respect to q(τ) yields
Figure BDA00028641256900002013
Further rearranging the above formula yields
Figure BDA00028641256900002014
Figure BDA00028641256900002015
Figure BDA00028641256900002016
Here, Nd is the sum of the number of labeled samples xdn and unlabeled samples xdn for a given d, Nd' is the number of labeled sample outputs ydn for a given d, Od = {n|dn ∈ O} denotes the set of indices n of all observable labeled samples zdn, and similarly O'd = {n|dn ∈ O'} denotes the set of indices n of all observable unlabeled samples xdn.
Considering the posterior distribution
Figure BDA0002864125690000211
The updated form is
Figure BDA0002864125690000212
The parametric form of q (T) can be obtained according to the above formula
Figure BDA0002864125690000213
Figure BDA0002864125690000214
Here, O:n={d|dn∈O},O':n={d|dn∈O'}。
Considering the posterior distribution
Figure BDA0002864125690000215
and taking the derivative with respect to q(α) yields
Figure BDA0002864125690000216
Rearranging the above formula, the following parameter form of q(α) is easily obtained
Figure BDA0002864125690000217
Figure BDA0002864125690000218
Similarly, considering the posterior distribution
Figure BDA0002864125690000219
and taking the derivative with respect to q(β) yields
Figure BDA0002864125690000221
The parameters are calculated as
Figure BDA0002864125690000222
Figure BDA0002864125690000223
For the independent student t-distribution model, the posterior distribution
Figure BDA0002864125690000224
can be differentiated to obtain
Figure BDA0002864125690000225
The distribution parameter of q (U) can be obtained from the above formula
Figure BDA0002864125690000226
Figure BDA0002864125690000227
where
Figure BDA0002864125690000228
is calculated in the form
Figure BDA0002864125690000229
Figure BDA00028641256900002210
Is calculated in the form of
Figure BDA0002864125690000231
For a multidimensional student t-distribution noise model, one simply sets the parameters vd = v and udn = un.
Hyper-parameter
Figure BDA0002864125690000232
can be obtained by solving the following maximization problem
Figure BDA0002864125690000233
vd (d = D+1, D+2, …, D+E) can be obtained by solving the following maximization problem
Figure BDA0002864125690000234
where
Figure BDA0002864125690000235
is the student t-distribution of the d-th dimension.
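The degree-of-freedom maximizations of equations (43)-(44) are one-dimensional problems; where the text uses a MATLAB optimization function, the same idea can be sketched in Python with a crude grid search over the t log-likelihood. The data, grid, and search method here are illustrative stand-ins, not the patent's procedure:

```python
import math
import random

def t_loglik(xs, v):
    # log-likelihood of a standard student t with v degrees of freedom
    c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
         - 0.5 * math.log(v * math.pi))
    return sum(c - (v + 1) / 2 * math.log(1 + x * x / v) for x in xs)

random.seed(1)
true_v = 5.0
xs = []
for _ in range(5000):       # t samples drawn via the Gamma-Gaussian hierarchy
    u = random.gammavariate(true_v / 2, 2.0 / true_v)
    xs.append(random.gauss(0.0, u ** -0.5))

grid = [0.5 + 0.25 * k for k in range(1, 200)]   # candidate v values
v_hat = max(grid, key=lambda v: t_loglik(xs, v))
```

With 5000 samples the maximizer lands near the generating value v = 5; in the patent's setting, the same one-dimensional search is performed per dimension d.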
After modeling, for unlabeled data xn with corresponding hidden variable tn, the predicted output is
Figure BDA0002864125690000236
In implementation, the problem of selecting the hidden space dimension must be solved first. One approach is to compute the L2-norm of each column of the projection matrix, select the projection dimensions with larger column norms, and ignore the dimensions with smaller norms. The other approach is to roughly select the projection dimension by the norm method and then refine the choice by cross-validation.
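The column-norm rule described above can be sketched as follows; the 10% cutoff and the toy matrix are illustrative assumptions, since the text itself leaves the threshold and the cross-validation refinement open:

```python
import math

def select_dims(P, ratio=0.1):
    """Keep latent dimensions whose column L2-norm in the projection
    matrix P is at least `ratio` times the largest column norm."""
    norms = [math.sqrt(sum(row[j] ** 2 for row in P)) for j in range(len(P[0]))]
    top = max(norms)
    return [j for j, nrm in enumerate(norms) if nrm >= ratio * top]

P = [[0.9, 0.01, 0.4],      # toy 3x3 projection matrix; column 1 is negligible
     [0.8, 0.02, 0.3],
     [0.7, 0.01, 0.5]]
kept = select_dims(P)
```

Here `kept` retains columns 0 and 2 and drops the near-zero column, after which cross-validation can refine the choice as described.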
Additionally, initial values of the model parameters need to be determined. The initial values of C, P, μy, μx and the hidden variables can be obtained by applying the PPLS of reference [19] to the output quality data and the input data respectively. The advantage is that model learning under the Bayesian variational inference iterative framework is more stable and convergence is accelerated.
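Under the linear-Gaussian structure of equation (1), the prediction step has the standard PPLS/PPCA form ŷ = C·E[t|x] + μy with E[t|x] = (I + PᵀΛP)⁻¹PᵀΛ(x − μx), where Λ is the diagonal noise-precision matrix. This is a hedged sketch assuming that standard form; the patent's exact equation (45) is given as an image in the original:

```python
import numpy as np

rng = np.random.default_rng(0)
D, E, M = 4, 2, 2                       # input dim, output dim, latent dim
P = rng.normal(size=(D, M))             # input loading matrix
C = rng.normal(size=(E, M))             # output loading matrix
mu_x, mu_y = rng.normal(size=D), rng.normal(size=E)
lam = 1e6 * np.ones(D)                  # noise precisions (near-noiseless check)

t_true = rng.normal(size=M)
x = P @ t_true + mu_x                   # a noise-free input sample

# E[t|x] for the linear-Gaussian model, then the soft-sensor prediction
L = np.diag(lam)
t_post = np.linalg.solve(np.eye(M) + P.T @ L @ P, P.T @ L @ (x - mu_x))
y_hat = C @ t_post + mu_y
```

With large noise precisions, t_post recovers t_true and y_hat approaches C·t_true + μy, confirming the sketch behaves as the generative model in equation (1) predicts.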
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modifications made on the basis of the technical scheme according to the technical idea of the present invention fall within the protection scope of the present invention. While the embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (6)

1. A Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data, characterized by comprising a Bayesian semi-supervised robust PPLS model for incomplete data and model parameter learning by Bayesian variational inference; the method specifically comprises the following steps;
step 1, initializing prior distribution parameters and hidden variable distribution hyperparameters;
step 2, determining initial model parameters and hidden variable parameters from the training data set by using the PPLS (probabilistic partial least squares) method based on the EM (expectation-maximization) algorithm;
step 3, calculating the posterior distribution q(Δ) of the hidden variables according to the Bayesian variational inference method and updating the model parameters and hidden variables;
step 4, solving an optimization problem and solving an optimal prior hyperparameter v;
step 5, calculating the variational lower bound of the log-likelihood function according to the approximate posterior distribution;
step 6, judging whether a convergence condition is met, if so, predicting quality data corresponding to the unknown data sample, and realizing soft measurement of the quality index; otherwise, returning to the step 3.
2. The Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data as recited in claim 1, wherein: in step 6, the model convergence condition is that the change of the likelihood function L (q (Δ), Θ) is smaller than the predetermined threshold thr, i.e. the model convergence condition is
|L(q(Δ^(t+1)), Θ^(t+1)) - L(q(Δ^(t)), Θ^(t))| < thr
wherein L(q(Δ^(t)), Θ^(t)) and L(q(Δ^(t+1)), Θ^(t+1)) respectively represent the values at the t-th and (t+1)-th iterations, and the threshold thr is set to 10^-5.
3. The Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data as recited in claim 1, wherein: the Bayes semi-supervised robust PPLS model of incomplete data specifically comprises the following steps;
given the output of the labeled samples
Figure FDA0002864125680000011
And input
Figure FDA0002864125680000012
and the unlabeled samples
Figure FDA0002864125680000013
and satisfying N = NL + Nu; the observation noise and the process noise both obey independent t-distributions, where NL and Nu respectively represent the numbers of labeled and unlabeled samples, and D and E respectively represent the input and output dimensions of the samples; let the shared hidden variable between the PLS input data and output data be
Figure FDA0002864125680000014
The probabilistic PLS model can be expressed as
Figure FDA0002864125680000015
wherein
Figure FDA0002864125680000016
and
Figure FDA0002864125680000017
are weight matrices,
Figure FDA0002864125680000018
is the shared hidden variable; μx and μy are the mean vectors of the process variables and observation variables respectively; x and y are column vectors of dimensions D and E respectively, and the process data noise obeys the t-distribution
Figure FDA0002864125680000021
vx=[vx,1, vx,2, …, vx,D], τx=[τx,1, τx,2, …, τx,D]; the observation data noise εy also obeys an independent t-distribution of the same form, wherein the student t-distribution has the form
Figure FDA0002864125680000022
wherein
Figure FDA0002864125680000023
is the Gamma function,
Figure FDA0002864125680000024
represents a Gamma distribution with shape parameter a' and inverse scale parameter b', v' being the degree of freedom; the t-distribution is interpreted as an infinite mixture of Gaussian distributions, and u' is the hidden variable controlling the noise level of the variable; for the hidden variable t, the priors of the model parameters P, C, μ and the noise level τ are similar to those of PPCA, in the form
p(t)=N(t|0,IM) (3)
Figure FDA0002864125680000025
Figure FDA0002864125680000026
Figure FDA0002864125680000027
wherein
Figure FDA0002864125680000028
denotes the column vector composed of τx and τy,
Figure FDA0002864125680000029
represents the vector composed of μx and μy, β denotes βx and βy, and the superscript T denotes the transpose of a matrix or vector; the parameter τ is used in the priors of P, C and μ; the prior of the noise level parameter τ is a Gamma distribution, with the distributions of the individual variables mutually independent, i.e.
Figure FDA00028641256800000210
The priors of the parameters α = [αx, αy] and β are
Figure FDA00028641256800000211
p(β)=Ga(β|aβ,bβ)
Figure FDA00028641256800000212
In the simulation, to obtain broader distributions, the hyper-parameters are set to aτ = bτ = aα = bα = aβ = bβ = 10^-5; for isotropic noise, all noise level parameters are the same, i.e., τm = τ can be set;
Let
Figure FDA00028641256800000213
denote the matrix composed of the labeled data,
Figure FDA00028641256800000214
denote a labeled data sample,
Figure FDA00028641256800000215
denote the matrix of all inputs from the labeled and unlabeled sample sets,
Figure FDA00028641256800000216
denote the weight matrix,
Figure FDA00028641256800000217
denote the noise of the labeled samples,
Figure FDA00028641256800000218
denote the mean vector,
Figure FDA00028641256800000219
denote the hyper-parameter vector of the noise distribution,
Figure FDA00028641256800000220
denote the noise level vectors of the process variables and observation variables,
Figure FDA0002864125680000031
representing shared hidden variables corresponding to marked samples and unmarked samples,
Figure FDA0002864125680000032
representing hidden variables corresponding to marked samples and unmarked samples,
Figure FDA0002864125680000033
for the n-th marked sample
Figure FDA0002864125680000034
If the elements of zn are observed independently, then its elements zdn are mutually independent; under this assumption, an independent student t-distribution can be used to model each element of the noise εn of zn, n = 1, 2, …, NL, d = 1, 2, …, D+E; the likelihood function for the labeled samples is
Figure FDA0002864125680000035
wherein O denotes the set of indices dn for which zdn is observable, wd is the d-th row vector of W, d = 1, 2, …, D+E, n = 1, 2, …, NL; for the unlabeled samples
Figure FDA0002864125680000036
the corresponding likelihood function is
Figure FDA0002864125680000037
wherein O' denotes the set of indices d'n' for which the elements of the unlabeled samples Xu are observable; note that here wd' is equivalent to pd', d' = 1, 2, …, D, n' = NL+1, NL+2, …, N, and μ1:D, the vector formed by the 1st through D-th elements of μ, is the vector μx
By introducing the hidden variable U, the student t-distribution can be constructed from Gaussian distributions; the likelihood function of all labeled and unlabeled samples is
Figure FDA0002864125680000038
wherein W1:D,:, the matrix formed by the row vectors of rows 1 to D of the matrix W, is the matrix P.
4. The Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data as recited in claim 1, wherein: model parameter learning of Bayesian variational inference specifically comprises the following steps;
step 2.1 Bayesian variational reasoning;
and 2.1, learning posterior distribution parameters of Bayesian variational inference.
5. The Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data as recited in claim 4, wherein: bayes variational reasoning, which specifically comprises the following steps;
the principle of Bayesian variational inference is as follows: given a training data set Ω, the model parameters and hidden variables to be optimized Δ = {T, μ, W, τ, U, α, β, Θ}, the true posterior distribution p(Δ|Ω), and a probability distribution q(Δ) of arbitrary form over Δ, the log-likelihood function ln p(Ω) can be decomposed as
ln p(Ω)=L(q)+KL(q||p) (10)
Wherein L(q) = ∫q(Δ)ln(p(Δ, Ω)/q(Δ))dΔ and KL(q||p) = ∫q(Δ)ln(q(Δ)/p(Δ|Ω))dΔ is the Kullback-Leibler (KL) divergence; since KL(q||p) ≥ 0, ln p(Ω) ≥ L(q), so maximizing ln p(Ω) is equivalent to maximizing L(q); q(Δ) is made to approximate p(Δ|Ω) by optimizing q(Δ) so that KL(q||p) approaches 0, in the form of the optimization problem
Figure FDA0002864125680000041
Suppose q (Δ) can be decomposed into products of respective optimized parameter distributions
Figure FDA0002864125680000042
The optimal approximate distribution qi(Δi) can be obtained by
Figure FDA0002864125680000043
wherein Δ-i denotes the set of optimization parameters Δ with Δi removed.
Based on Bayesian variational inference theory, according to the model probability structure diagram and the likelihood function, the joint posterior probability distribution function is
Figure FDA0002864125680000044
wherein, by conditional independence, p(W, μ, τ|α, β) = p(W|τ, α)p(μ|τ, β)p(τ);
from the mean field theory and the model probability map, the hidden variable posterior probability can be written as
p(W,T,μ,τ,U,α,β)≈q(T)q(μ,W|τ)q(U)q(α)q(β)q(τ) (14)
Let Δ ═ T, μ, W, τ, U, α, β, Θ }, q (Δ) ═ q (T) q (μ, W | τ) q (U) q (α) q (β) q (τ), and the lower bound of the variation of the log-likelihood function be, according to the bayes principle of variation
L(q(Δ),Θ)=<ln p(Δ,Z,Xu|Θ)>Δ-<lnq(Δ)>Δ+const (15)
Here const is an independent term that does not depend on Δ and is treated as a constant. Solving for L(q(Δ), Θ) is equivalent to deriving each of the variational distributions separately.
6. The Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data as recited in claim 4, wherein: the posterior distribution parameter learning of Bayes variational inference specifically comprises the following steps;
(1) taking into account the posterior distribution q (μ, W | τ, α, β) of the model parameters (μ, W), it is noted that
Figure FDA0002864125680000051
Figure FDA0002864125680000052
Taking the derivative of the variational lower bound L(q(Δ), Θ) with respect to q(μ, W|τ) yields the following related terms:
Figure FDA0002864125680000053
wherein -W denotes the remaining optimization parameters of Δ excluding W, Od = {n|dn ∈ O} denotes the set of indices n of all observable labeled samples zdn in the set O, O'd = {n|dn ∈ O'} denotes the set of indices n of all observable samples xdn in the set O', n = 1, 2, …, N, d = 1, 2, …, D+E, and <τd> denotes the expectation of the random variable τd; rearranging the above formula yields
Figure FDA0002864125680000054
whose mean and variance are
Figure FDA0002864125680000055
Figure FDA0002864125680000056
Note that O'd is the empty set when d = D+1, D+2, …, D+E. The same method yields
Figure FDA0002864125680000057
The mean value and variance are updated in the form of
Figure FDA0002864125680000058
Figure FDA0002864125680000059
Since q (W | τ, α) and q (μ | τ, β) follow a normal distribution, then
Figure FDA0002864125680000061
According to equations (16) - (20), the corresponding mean and variance of equation (21) are expressed as follows
Figure FDA0002864125680000062
Figure FDA0002864125680000063
Obtaining
Figure FDA0002864125680000064
Then, with
Figure FDA0002864125680000065
and μd, the related expectations have the following forms
Figure FDA0002864125680000066
Figure FDA0002864125680000067
Figure FDA0002864125680000068
Figure FDA0002864125680000069
Figure FDA00028641256800000610
Figure FDA00028641256800000611
wherein
Figure FDA00028641256800000612
is the covariance matrix corresponding to
Figure FDA00028641256800000613
, and
Figure FDA00028641256800000614
is the variance corresponding to μd;
Figure FDA00028641256800000615
denotes the covariance vector between
Figure FDA00028641256800000616
and μd; all of these can be obtained directly from the covariance matrix Σd;
(2) Considering the posterior distribution of the hidden variables
Figure FDA00028641256800000617
and taking the derivative with respect to q(τ) yields
Figure FDA00028641256800000618
Further rearranging the above formula yields
Figure FDA0002864125680000071
Figure FDA0002864125680000072
Figure FDA0002864125680000073
Here, Nd is the sum of the number of labeled samples xdn and unlabeled samples xdn for a given d, Nd' is the number of labeled sample outputs ydn for a given d, Od = {n|dn ∈ O} denotes the set of indices n of all observable labeled samples zdn, and similarly O'd = {n|dn ∈ O'} denotes the set of indices n of all observable unlabeled samples xdn;
(3) consideration of the posterior distribution of hidden variables T
Figure FDA0002864125680000074
The updated form is
Figure FDA0002864125680000075
The parametric form of q (T) can be obtained according to the above formula
Figure FDA0002864125680000076
Figure FDA0002864125680000077
Wherein, O:n={d|dn∈O},O':n={d|dn∈O'};
(4) Considering the posterior distribution of the hidden variable α
Figure FDA0002864125680000078
and taking the derivative with respect to q(α) yields
Figure FDA0002864125680000079
Rearranging the above formula, the following parameter form of q(α) is easily obtained
Figure FDA0002864125680000081
Figure FDA0002864125680000082
(5) Considering the posterior distribution of the hidden variable β
Figure FDA0002864125680000083
and taking the derivative with respect to q(β) yields
Figure FDA0002864125680000084
The parameters are calculated as
Figure FDA0002864125680000085
Figure FDA0002864125680000086
(6) For the independent student t-distribution model, the posterior distribution of the hidden variable U
Figure FDA0002864125680000087
can be differentiated to obtain
Figure FDA0002864125680000088
The distribution parameter of q (U) can be obtained from the above formula
Figure FDA0002864125680000089
Figure FDA00028641256800000810
wherein
Figure FDA00028641256800000811
is calculated in the form of
Figure FDA0002864125680000091
Figure FDA0002864125680000092
Is calculated in the form of
Figure FDA0002864125680000093
For a multidimensional student t-distribution noise model, one simply sets the parameters vd = v and udn = un;
(7) The hyper-parameter
Figure FDA0002864125680000094
can be obtained by solving the following maximization problem
Figure FDA0002864125680000095
v_d (d = D+1, D+2, …, D+E) can be obtained by solving the following maximization problem
Figure FDA0002864125680000096
where
Figure FDA0002864125680000097
is the Student's t-distribution in dimension d;
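The maximization over v_d (its objective is in the image formulas, not reproduced here) is, in standard robust-EM treatments of the Student's t-distribution, a one-dimensional problem whose stationarity condition has the form log(v/2) − ψ(v/2) + 1 + c = 0, where ψ is the digamma function and c collects the statistics (1/N)Σ(E[log u_n] − E[u_n]) from q(U). A hedged sketch of that standard root-find (hypothetical c; not the patent's exact objective):

```python
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

# c = (1/N) * sum(E[log u_n] - E[u_n]) from q(U); by Jensen's inequality c < -1,
# so a hypothetical value below -1 is used here for illustration.
c = -1.2

def f(v):
    # Stationarity condition of the usual Student-t degrees-of-freedom update.
    return np.log(v / 2.0) - digamma(v / 2.0) + 1.0 + c

v_opt = brentq(f, 1e-3, 1e3)   # one-dimensional root find over a wide bracket
print(round(v_opt, 3))
```

Because f is continuous and strictly decreasing from +∞ to 1 + c < 0 on the bracket, `brentq` is guaranteed to find the unique root.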
After modeling is complete, for unlabeled data x_n with corresponding hidden variable t_n, the predicted output is
Figure FDA0002864125680000098
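The prediction formula itself is in the image above. As a hedged sketch of the usual latent-variable regression route it describes: infer the posterior mean of t_n from the inputs alone, then map it through the output loadings. All parameter names below (P for input loadings, C for output loadings) are hypothetical stand-ins for the trained model:

```python
import numpy as np

rng = np.random.default_rng(1)
D, E, K = 6, 2, 3                  # input dim, output dim, latent dim (hypothetical)
P = rng.standard_normal((D, K))    # input loading matrix (assumed learned)
C = rng.standard_normal((E, K))    # output loading matrix (assumed learned)
beta = 1.5                         # input noise precision (assumed learned)

def predict(x):
    # E-step on the inputs only: posterior mean of t_n given x_n,
    # then the regression through the output loadings.
    Sigma = np.linalg.inv(np.eye(K) + beta * P.T @ P)
    t_mean = beta * Sigma @ P.T @ x
    return C @ t_mean              # predicted output y_n

x_new = rng.standard_normal(D)
y_hat = predict(x_new)
print(y_hat.shape)                 # (2,): one value per output dimension
```

This is the soft-sensor use case: once trained on incomplete, partly labeled data, the model predicts the hard-to-measure outputs from the readily measured inputs alone.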
CN202011576333.0A 2020-09-18 2020-12-28 Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data Pending CN112541558A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010985728 2020-09-18
CN202010985728X 2020-09-18

Publications (1)

Publication Number Publication Date
CN112541558A true CN112541558A (en) 2021-03-23

Family

ID=75017665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011576333.0A Pending CN112541558A (en) 2020-09-18 2020-12-28 Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data

Country Status (1)

Country Link
CN (1) CN112541558A (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102542126A (en) * 2011-10-10 2012-07-04 上海交通大学 Soft measurement method based on half supervision learning
CN103279030A (en) * 2013-03-07 2013-09-04 清华大学 Bayesian framework-based dynamic soft measurement modeling method and device
US20170061305A1 (en) * 2015-08-28 2017-03-02 Jiangnan University Fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression
CN110083065A (en) * 2019-05-21 2019-08-02 浙江大学 A kind of adaptive soft-sensor method having supervision factorial analysis based on streaming variation Bayes
EP3620983A1 (en) * 2018-09-05 2020-03-11 Sartorius Stedim Data Analytics AB Computer-implemented method, computer program product and system for data analysis
CN111142501A (en) * 2019-12-27 2020-05-12 浙江科技学院 Fault detection method based on semi-supervised autoregressive dynamic hidden variable model
US10678196B1 (en) * 2020-01-27 2020-06-09 King Abdulaziz University Soft sensing of a nonlinear and multimode processes based on semi-supervised weighted Gaussian regression


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUNHUA ZHENG et al.: "Semisupervised learning for probabilistic partial least squares regression model and soft sensor application", Journal of Process Control, vol. 64, 7 March 2018 (2018-03-07), pages 123-131 *
LIU Ziwei: "Soft sensor modeling based on Bayesian networks", China Master's Theses Full-text Database (Information Science and Technology), no. 8, 15 August 2019 (2019-08-15), pages 140-101 *
ZHENG Junhua: "Latent variable regression modeling of industrial process data and its application", China Doctoral Dissertations Full-text Database (Basic Science), no. 8, 15 August 2018 (2018-08-15), pages 002-14 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113050602A (en) * 2021-03-26 2021-06-29 杭州电子科技大学 Industrial process fault method based on robust semi-supervised discriminant analysis
CN113050602B (en) * 2021-03-26 2022-08-09 杭州电子科技大学 Industrial process fault classification method based on robust semi-supervised discriminant analysis

Similar Documents

Publication Publication Date Title
Grbić et al. Adaptive soft sensor for online prediction and process monitoring based on a mixture of Gaussian process models
Buizza et al. Data learning: Integrating data assimilation and machine learning
Rezende et al. Stochastic backpropagation and variational inference in deep latent gaussian models
Dubois et al. Data-driven predictions of the Lorenz system
Silverman et al. Bayesian multinomial logistic normal models through marginally latent matrix-T processes
Baldacchino et al. Variational Bayesian mixture of experts models and sensitivity analysis for nonlinear dynamical systems
Zhou et al. Surrogate modeling of high-dimensional problems via data-driven polynomial chaos expansions and sparse partial least square
Guo et al. A just-in-time modeling approach for multimode soft sensor based on Gaussian mixture variational autoencoder
O’Reilly et al. Univariate and multivariate time series manifold learning
Bühlmann Causal statistical inference in high dimensions
Luo et al. Bayesian deep learning with hierarchical prior: Predictions from limited and noisy data
Chen et al. Low-rank autoregressive tensor completion for multivariate time series forecasting
Xiong et al. Soft sensor modeling with a selective updating strategy for Gaussian process regression based on probabilistic principle component analysis
Delasalles et al. Spatio-temporal neural networks for space-time data modeling and relation discovery
Guo et al. A mutual information-based Variational Autoencoder for robust JIT soft sensing with abnormal observations
Kidd et al. Bayesian nonstationary and nonparametric covariance estimation for large spatial data (with discussion)
Mora et al. Probabilistic neural data fusion for learning from an arbitrary number of multi-fidelity data sets
CN112541558A (en) Bayesian semi-supervised robust PPLS soft measurement method based on incomplete data
Williams et al. Sensing with shallow recurrent decoder networks
Cailliez et al. Bayesian calibration of force fields for molecular simulations
Pilosov et al. Parameter estimation with maximal updated densities
Ren et al. Tuning-free heterogeneous inference in massive networks
CN114757245A (en) Multi-mode monitoring method for semi-supervised identification mixed probability principal component analysis
Tiomoko et al. Deciphering lasso-based classification through a large dimensional analysis of the iterative soft-thresholding algorithm
Glaude et al. Subspace identification for predictive state representation by nuclear norm minimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination