CN114936528A - Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition - Google Patents

Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition Download PDF

Info

Publication number
CN114936528A
CN114936528A CN202210632112.3A CN202210632112A CN114936528A CN 114936528 A CN114936528 A CN 114936528A CN 202210632112 A CN202210632112 A CN 202210632112A CN 114936528 A CN114936528 A CN 114936528A
Authority
CN
China
Prior art keywords
matrix
equation
variable
formula
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210632112.3A
Other languages
Chinese (zh)
Inventor
王平
李雪静
尹贻超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China filed Critical China University of Petroleum East China
Priority to CN202210632112.3A priority Critical patent/CN114936528A/en
Publication of CN114936528A publication Critical patent/CN114936528A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a variable weighting self-adaptive local composition extreme learning machine semi-supervised soft measurement modeling method, which constructs a neighbor graph in a self-adaptive manner by comprehensively utilizing weighted Euclidean distance information of a data input space and a prediction output space to realize accurate approximation of potential structure information of data; meanwhile, considering that different auxiliary variables have different degrees of contribution to accurate estimation of the dominant variable, different weights are given to different auxiliary variables through variable weighting learning, and the adverse effects of redundant variables and noise on composition and regression learning are reduced; and finally, integrating variable weighting, self-adaptive composition and extreme learning machine modeling in a unified learning frame, and solving by adopting alternative iterative optimization to obtain an overall optimal solution of modeling learning. Therefore, the semi-supervised learning framework provided by the invention can fully utilize the supervision information contained in the label data, and assist the structural information contained in the label-free data to improve the performance of the extreme learning machine model, thereby achieving the purpose of improving the generalization capability and reliability of the soft measurement model.

Description

Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition
Technical Field
The invention belongs to the technical field of industrial process detection, relates to an industrial process soft measurement technology, and particularly relates to a variable weighting self-adaptive local composition-based extreme learning machine semi-supervised soft measurement modeling method.
Background
The modern industrial production process is rapidly developing towards digitization and intellectualization, and meanwhile, the pursuit of product quality control is higher and higher. For this reason, actual production plants are usually equipped with a large number of industrial sensors for measuring in real time the operating parameters reflecting the operating conditions of the process, providing the necessary feedback information for the realization of a closed-loop optimal control of the product quality. However, there are some important parameters in industrial process, such as product concentration, components and physical parameters, which are closely related to product quality but difficult to realize direct measurement. At present, the key quality related parameters can only be obtained by means of offline sampling and then sending the parameters to a laboratory for assay and analysis, and the problems of long measurement period, large feedback lag, high manpower and material resource cost and the like are solved. Therefore, when the process condition changes, an operator cannot timely master the real running state of the process, and is difficult to give correct adjustment countermeasures, so that the production efficiency is reduced, and even the operation safety is endangered. Soft measurement techniques have been developed in the context of indirect estimation of key quality related parameters by establishing a mathematical model between easily measurable process variables (also known as auxiliary variables) and difficult to directly measure quality related variables (also known as dominant variables). Compared with laboratory test analysis or on-line component instruments, the instrument has the obvious advantages of low application and maintenance cost, timely response and the like, and therefore, the instrument is widely applied to a plurality of industrial fields of oil refining, chemical engineering, metallurgy, pharmacy and the like.
The key of the soft measurement technology is to establish a mathematical model capable of accurately describing potential functional relationships between the auxiliary variables and the main variables. If the production process is deeply known and the knowledge of relevant fields is enriched, a mechanism modeling method can be adopted to establish a soft measurement model. However, the complexity and uncertainty of modern industrial processes greatly limit the scope of application of the mechanism modeling approach. Therefore, the regression modeling method based on data driving has more advantages in universality and flexibility because the regression modeling method does not depend on specific field professional knowledge, and is widely applied to the field of soft measurement in recent decades. Representative techniques include principal component regression, partial least squares, artificial neural networks, and support vector machines. In particular, with the advent of the big data era, artificial neural network algorithms based on deep learning, such as convolutional neural networks, cyclic neural networks, and autoencoders, have become a research hotspot in the field of soft measurement modeling and have achieved a series of remarkable results in recent years.
The performance of data-driven based soft-measurement modeling approaches depends to a large extent on the quantity and quality of the training data. Specifically, to obtain a soft-metric model with a high generalization capability, it is necessary to train with a large number of input-output datasets covering the main operating conditions of the process. In particular, the method is especially suitable for machine learning models with complex structures, such as artificial neural networks, and numerous adjustable parameters. However, for practical soft-metrology modeling problems, the sampling rate of the dominant variable (corresponding to the output variable of the soft-metrology model) is typically much lower than the sampling frequency of the auxiliary variable (corresponding to the input variable of the soft-metrology model). This results in only a small portion of the training data actually collected having both input and output values, while the vast majority of the data has only input values, with corresponding output values being missing. In the field of machine learning, data in which both inputs and outputs have values are referred to as labeled data, and data in which only inputs have values are referred to as unlabeled data. At present, a data-driven soft measurement modeling method mostly adopts a supervised learning mode, namely, only label data is used for modeling, but the effect of label-free data is ignored, under the condition that label samples are scarce, a model overfitting phenomenon easily occurs, the generalization capability and the reliability of the model cannot be guaranteed, and the requirements of practical application cannot be met. Actually, the unlabeled data contains abundant data structure information, and a series of researches show that the performance of the regression model can be remarkably improved by reasonably utilizing the information contained in the unlabeled data. Therefore, more and more attention is paid to establishing a soft measurement mathematical model by using a semi-supervised learning mode, namely, a small amount of label data and a large amount of label-free data.
According to different label-free data utilization modes, the existing semi-supervised soft measurement modeling method can be divided into probability generation (Probabilistic generation), Self-Training (Self-Training), Co-Training (Co-Training), and Manifold Regularization (MR). The MR establishes the relation between the unlabeled data and the labeled data by constructing a neighbor graph based on the assumption that similar inputs correspond to similar outputs and the local smoothness assumption, and provides an effective mechanism for popularizing and applying the supervised learning model to the semi-supervised learning scene. For example, Huang et al (2014) constructs a graph regularization constraint term by a k-nearest neighbor method based on label data and label-free data, and adds the constraint term to an Extreme Learning Machine (ELM) optimization objective function to obtain a semi-supervised ELM algorithm. Similarly, Yan et al (2016) and Zhao et al (2020) introduce the MR term into the gaussian process regression model and the width learning neural network model, respectively, resulting in corresponding semi-supervised learning algorithms. The research shows that the MR-based semi-supervised learning framework can simultaneously utilize the supervision/identification information provided by the label data and the structural information contained in the label-free data to improve the generalization capability and reliability of the model, has a concise description form and a solid theoretical basis, and is successfully applied in the field of soft measurement modeling.
It is worth noting that an important premise for achieving performance improvement of the MR-based semi-supervised learning method is that the constructed neighbor graph can achieve accurate approximation of potential local prevalence structures of data. Considering that the data prevalence structure is unknown in advance and problem-related, many methods for constructing a neighbor graph, such as k-neighbor method, local linear representation, sparse self-representation, low-rank self-representation, etc., have been proposed and successfully applied in a plurality of different research fields. However, most of the existing methods adopt an unsupervised learning mode to construct a neighbor graph offline in an original high-dimensional input space of data. This may cause the following two problems: (1) redundant auxiliary variables and noise inevitably exist in actual modeling data, and in the composition process, redundant information can seriously influence the calculation of the similarity between data, so that each node in the constructed neighbor graph is connected in error. (2) The existing method generally adopts an off-line composition mode, composition and subsequent regression modeling learning are independently completed as two independent learning tasks, and the internal relation between the composition and the regression learning is neglected, so that supervision information provided by a label sample cannot be effectively utilized during composition, and the problem that the constructed neighbor graph is not adaptive to the subsequent regression modeling task is caused.
In conclusion, the existing MR-based semi-supervised learning method is adopted to solve the outstanding problems of weak model generalization capability, poor reliability and the like easily occurring when the actual soft measurement modeling problem is solved. The main reason is that the existing method neglects the inevitable internal relation between composition and regression modeling learning, so that the structure and parameters of the constructed graph cannot accurately describe the potential structural information of the data, and the purpose of improving the model performance by using the label-free sample cannot be achieved.
Disclosure of Invention
Aiming at the key problem of disjointed composition and regression modeling existing in the existing semi-supervised soft measurement modeling technology based on popular regularization, the invention provides a semi-supervised soft measurement modeling method of an extreme learning machine based on variable weighting self-adaptive local composition, which closely links variable weighting, self-adaptive composition and extreme learning machine modeling and forms a unified optimized learning framework for joint solution. Specifically, the method constructs a neighbor graph in a self-adaptive manner by comprehensively utilizing weighted Euclidean distance information of a data input space and a prediction output space to realize accurate approximation of potential structure information of the data; meanwhile, considering that different auxiliary variables have different contribution degrees to accurate estimation of the dominant variable, different weights are given to the different auxiliary variables through variable weighting learning, and the adverse effects of redundant variables and noise on composition and regression modeling are reduced; and finally, integrating variable weighting, self-adaptive composition and extreme learning machine modeling in a unified optimization frame, and adopting alternate iterative solution to achieve the overall optimization of modeling learning. Therefore, the semi-supervised learning framework provided by the invention can fully utilize the supervision information contained in the label data and assist the structural information contained in the label-free data, thereby achieving the purpose of improving the generalization capability and reliability of the soft measurement model.
In order to achieve the aim, the invention provides a variable weighting self-adaptive local composition-based extreme learning machine semi-supervised soft measurement modeling method, which comprises the following steps of:
an off-line modeling stage: collecting assay analysis values y of leading variables i And auxiliary variable measured values corresponding thereto
Figure BDA0003680380850000031
Where i is 1,2, …, n l ,n l D is the dimension of the auxiliary variable, and is the number of the collected main variable values; additional Collection of n u Measured value of an auxiliary variable
Figure BDA0003680380850000032
j=n l +1,n l +2,…,n l +n u Definition n ═ n l +n u The number of the collected auxiliary variable values; sequencing the collected auxiliary variable values in rows to obtain an auxiliary variable data matrix
Figure BDA0003680380850000033
The superscript T represents matrix transposition operation, and accordingly, the collected dominant variable values are sorted by rows to obtain dominant variable data row vectors
Figure BDA0003680380850000034
Further, define n u All 0 row vector
Figure BDA0003680380850000035
Will y l And y u Are combined into a row vector
Figure BDA0003680380850000036
By using X 0 Mean value of (X) 0 ) Sum mean square deviation std (X) 0 ) To X 0 Is subjected to standardization treatment to obtain
Figure BDA0003680380850000037
By y 0 Mean of (y) 0 ) And the mean square error std (y) 0 ) To y 0 Is subjected to standardization treatment to obtain
Figure BDA0003680380850000038
Obtaining distance of extreme learning machineLine training data X, y;
(II) appointing the number of hidden layer neurons of the extreme learning machine as n h Regularization parameters beta, lambda, mu and theta and maximum iteration times max _ iteration, and initializing a variable weighting matrix
Figure BDA0003680380850000039
Calculating the distance between each sample and selecting the ith sample x i Constructing an initial Laplace matrix by using the nearest k samples
Figure BDA00036803808500000310
(III) randomly generating a weight matrix between the input layer and the hidden layer of the extreme learning machine
Figure BDA0003680380850000041
And a matrix of bias terms
Figure BDA0003680380850000042
Computing hidden layer outputs using activation functions
Figure BDA0003680380850000043
(IV) updating the weights between hidden layer and output layer
Figure BDA0003680380850000044
Bias b and model predictive tag value
Figure BDA0003680380850000045
(V) Using the similarity matrix
Figure BDA0003680380850000046
Updating a variable weighting matrix
Figure BDA0003680380850000047
(VI) updating the similarity matrix
Figure BDA0003680380850000048
(VII) repeating the step (IV), (V) and (VI) until the maximum iteration number max _ iteration is reached;
(eight) an online testing stage: collecting test data X new Using training data X 0 Mean value of (X) 0 ) And mean square deviation std (X) 0 ) For test data X new Carrying out standardization processing to obtain standardized test data
Figure BDA0003680380850000049
Predicting results for ELM
Figure BDA00036803808500000410
Performing anti-standardization to obtain X new Corresponding estimated value
Figure BDA00036803808500000411
Further, in the step (a), training data is first utilized
Figure BDA00036803808500000412
Mean value of (X) 0 ) And mean square deviation std (X) 0 ) Training data X by equation (1) 0 The normalization process is performed, and the expression of formula (1) is:
Figure BDA00036803808500000413
in the formula, mean (-) represents the mean value of each column of the calculation matrix, std (-) represents the mean square error of each column of the calculation matrix, and normalized training data are obtained
Figure BDA00036803808500000414
To pair
Figure BDA00036803808500000415
A similar normalization process is also required as in equation (2):
Figure BDA00036803808500000416
further, in the step (two), the variable weighting matrix is initialized
Figure BDA00036803808500000417
Calculating the distance between each sample and selecting the ith sample x i Constructing an initial Laplace matrix by using the nearest k samples
Figure BDA00036803808500000418
The specific steps are as follows;
first, the variable weighting matrix M is initialized by equation (3), where equation (3) is expressed as:
Figure BDA00036803808500000419
wherein M is an element on the diagonal
Figure BDA00036803808500000420
The remaining elements are a diagonal matrix of 0 s,
Figure BDA00036803808500000421
next, an initial laplacian matrix L is calculated by formula (4) -formula (7), which is performed as follows:
Figure BDA00036803808500000422
Figure BDA00036803808500000423
Figure BDA0003680380850000051
Figure BDA0003680380850000052
in the formula, the first step is that,
Figure BDA0003680380850000053
D ii for the ith element on the diagonal of the diagonal matrix D, L is the laplacian matrix corresponding to the dataset X, and then equation (4) is solved according to equation (8) -equation (12):
Figure BDA0003680380850000054
Figure BDA0003680380850000055
equation (9) is written as lagrangian equation (10) by defining two lagrangian multipliers. For equation (10) with respect to s i Obtaining a partial derivative of 0
Figure BDA0003680380850000056
The optimal solution obtained according to the KTT condition is shown in formula (11),
Figure BDA0003680380850000057
Figure BDA0003680380850000058
wherein
Figure BDA0003680380850000059
Wherein (·) + When the value in the parenthesis is greater than 0, the value itself is taken, and when the value is less than or equal to 0,0 is taken, and the formula (12) is used to obtain s which is sparsely expressed by only k nonzero elements i
Figure BDA00036803808500000510
Equation (12) is further reduced to equation (13):
Figure BDA00036803808500000511
since γ is related to k, k is an integer and 0 ≦ k ≦ n, the parameter γ may be expressed as equation (14):
Figure BDA00036803808500000512
substituting η and γ into equation (11) to obtain:
Figure BDA00036803808500000513
further, in the step (three), a weight matrix between the input layer and the hidden layer of the extreme learning machine is randomly generated
Figure BDA0003680380850000061
And a matrix of bias terms
Figure BDA0003680380850000062
And (3) mapping the data X by a sigmod function to form a hidden layer output matrix of the extreme learning machine by using a formula (16)
Figure BDA0003680380850000063
The expression of formula (16) is:
Figure BDA0003680380850000064
wherein, W in For an input weight matrix randomly generated in the (-1, 1) range, B in Representing a matrix of randomly generated bias terms. Outputting hidden layer H 0 And the input data X are combined according to the rows to obtain an augmented data matrix H ═ H 0 ,X];
Further, in the step (IV), the weight between the hidden layer and the output layer is updated
Figure BDA0003680380850000065
Bias b and model predictive tag value
Figure BDA0003680380850000066
Firstly, organically integrating an extreme learning machine, variable weighting and self-adaptive local composition into a unified optimization objective function, wherein a minimized objective function is shown as a formula (17):
Figure BDA0003680380850000067
Figure BDA0003680380850000068
wherein Tr (-) represents a matrix tracing operation,
Figure BDA0003680380850000069
is represented by 2 Norm squared, 1 denotes a column vector with all elements 1, and the diagonal matrix U ═ diag (β, β, … β,0,0, …,0) ∈ R n×n I.e. give n l The label values are given certain weights of beta, lambda, mu, beta and theta as given regularization parameters, and
Figure BDA00036803808500000610
w, b and f are weight, bias and model prediction tag values between the hidden layer and the output layer respectively;
then, the similarity matrix is fixed
Figure BDA00036803808500000611
Variable weighting matrix
Figure BDA00036803808500000612
Get the optimization problem description about w, b and f as publicThe formula (18) shows, so that the analytic expression of w, b can be obtained as shown in the formula (19):
Figure BDA00036803808500000613
Figure BDA00036803808500000614
Figure BDA00036803808500000615
in formula (19), a ═ λ (λ HH) C H T +I q×q ) -1 H T H C Wherein, let q be d + n h
Figure BDA00036803808500000616
I denotes an identity matrix, 1 denotes a column vector with all elements 1, and the objective function Hw +1 in equation (18) is substituted with equation (19) n×1 b yields equation (20):
Figure BDA0003680380850000071
wherein the content of the first and second substances,
Figure BDA0003680380850000072
finally, from equation (19) and equation (20), the optimization problem in equation (18) can be transformed into equation (21):
Figure BDA0003680380850000073
the partial derivative of f is calculated and made to be 0, and the analytical expression of f can be obtained as shown in formula (22),
f=(U+L+μλH C -μλ 2 N) -1 Uy (22)
wherein
Figure BDA0003680380850000074
X C =HH C
Further, in the step (five), a similarity matrix is used
Figure BDA0003680380850000075
Updating a variable weighting matrix
Figure BDA0003680380850000076
The specific steps are formula (23) -formula (25):
first, the output is fixed
Figure BDA0003680380850000077
The similarity matrix S, the objective function is simplified from equation (17) to equation (23):
Figure BDA0003680380850000078
then, by solving equation (23), the updated equation (24) of the variable weighting matrix M can be obtained:
Figure BDA0003680380850000079
wherein, t i =z ii ,Z=X T LX,z ii Is the element on the ith main diagonal of the matrix Z;
further, in the step (six), the similarity matrix is updated
Figure BDA00036803808500000710
Formula (25) to formula (27):
first, the output is fixed
Figure BDA00036803808500000711
Variable weighting matrix
Figure BDA00036803808500000712
The objective function is simplified from equation (17) to equation (25):
Figure BDA00036803808500000713
equation (25) can then be further reduced to solve the optimization problem as shown in equation (26):
Figure BDA00036803808500000714
finally, obtaining an updating formula (27) of the similarity matrix S according to the principle in the step (II),
Figure BDA00036803808500000715
in the formula (27), the first and second groups,
Figure BDA0003680380850000081
further, in the step (seven), the step (four), (five), (six) are repeated until the maximum iteration number max _ iteration is reached.
Further, in the step (eight), the online test stage includes the specific steps of:
first, for the collected n t Test data of
Figure BDA0003680380850000082
Using training data
Figure BDA0003680380850000083
Mean value of (X) 0 ) And mean square deviation std (X) 0 ) Test data X by equation (28) new The normalization process is performed, and the expression of formula (28) is:
Figure BDA0003680380850000084
then, based on the standardized test data
Figure BDA0003680380850000085
Calculating the output value of the test data by the formula (29) and the formula (30)
Figure BDA0003680380850000086
The formula (29) and the formula (30) are respectively expressed as:
H t0 =X test W in +B in (29)
Figure BDA0003680380850000087
wherein the content of the first and second substances,
Figure BDA0003680380850000088
for hidden layer output, hidden layer output H t0 And input data X test Merging by rows to obtain an augmented data matrix
Figure BDA0003680380850000089
Finally predicting result y of ELM test Performing anti-standardization to obtain X new Corresponding estimated value
Figure BDA00036803808500000810
As shown in equation (31):
y new =y test ×std(y 0 )+mean(y 0 ) (31)
compared with the prior art, the invention has the advantages and positive effects that:
the semi-supervised soft measurement modeling method of the extreme learning machine based on the variable weighting self-adaptive local composition provided by the invention closely links the variable weighting, the self-adaptive composition and the extreme learning machine modeling, and forms a unified optimized learning framework for joint solution. Specifically, on one hand, the method disclosed by the invention realizes accurate approximation of the potential structure information of the data by comprehensively utilizing weighted Euclidean distance information of a data input space and a prediction output space to construct a neighbor map in a self-adaptive manner. On the other hand, considering that different auxiliary variables have different contribution degrees to accurate estimation of the dominant variable, different weights are given to the different auxiliary variables through variable weighting learning, and the adverse effects of redundant variables and noise on composition and regression modeling are reduced. Compared with other existing algorithms, the method integrates variable weighting, self-adaptive composition and extreme learning machine modeling into a unified optimization frame, and adopts alternate iterative solution to achieve integral optimization of modeling learning. The method can fully utilize the supervision information contained in the label data and assist the structural information contained in the label-free data, thereby achieving the purpose of improving the generalization capability and reliability of the soft measurement model.
Drawings
FIG. 1 is a flow chart of a variable weighting adaptive local composition-based extreme learning machine semi-supervised soft measurement modeling method according to the present invention;
FIG. 2 is a schematic diagram of a debutanizer process according to an embodiment of the present invention;
FIG. 3 is a graph of the impact of different regularization parameters on the decision coefficients of a training set and a test set under a semi-supervised extreme learning machine model;
FIG. 4 is a graph of the effect of different regularization parameters on the decision coefficients of a training set and a test set under a semi-supervised extreme learning machine model of an adaptive local composition;
FIG. 5 is a graph of the effect of different regularization parameters on the decision coefficients of a training set and a test set under a semi-supervised extreme learning machine model of variable weighted adaptive local patterning;
FIG. 6 is a test set coefficient change diagram of the three models under optimal parameters;
Detailed Description
The invention is described in detail below by way of exemplary embodiments. It should be understood, however, that elements, structures and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.
Referring to fig. 1, the invention discloses an extreme learning machine semi-supervised soft measurement modeling method based on variable weighting adaptive local composition, which comprises the following steps:
an off-line modeling stage: collecting assay analysis values y of leading variables i And auxiliary variable measured values corresponding thereto
Figure BDA0003680380850000091
Where i is 1,2, …, n l ,n l D is the dimension of the auxiliary variable, and is the number of the collected main variable values; additional Collection of n u A measured value of an auxiliary variable
Figure BDA0003680380850000092
j=n l +1,n l +2,…,n l +n u Definition n ═ n l +n u The number of the collected auxiliary variable values; sequencing the collected auxiliary variable values in rows to obtain an auxiliary variable data matrix
Figure BDA0003680380850000093
The superscript T represents matrix transposition operation, and accordingly, the collected dominant variable values are sorted according to rows to obtain the row vectors of the dominant variable data
Figure BDA0003680380850000094
Further, define n u All 0 row vector
Figure BDA0003680380850000095
Will y l And y u Are combined into a row vector
Figure BDA0003680380850000096
By using X 0 Mean value of (X) 0 ) Sum mean square deviation std (X) 0 ) To X 0 Is standardized to obtain
Figure BDA0003680380850000097
By y 0 Mean of (y) 0 ) And the mean square error std (y) 0 ) For y 0 Is subjected to standardization treatment to obtain
Figure BDA0003680380850000098
Obtaining off-line training data X, y of the extreme learning machine, and the method comprises the following specific steps:
using training data
Figure BDA0003680380850000099
Mean value of (X) 0 ) And mean square deviation std (X) 0 ) Training data X by equation (1) 0 The normalization process is performed, and the expression of formula (1) is:
Figure BDA00036803808500000910
in the formula (1), mean (-) represents the mean value of each column of the calculation matrix, std (-) represents the mean square error of each column of the calculation matrix, and normalized training data are obtained
Figure BDA00036803808500000911
To pair
Figure BDA00036803808500000912
A similar normalization process is also required as in equation (2):
Figure BDA0003680380850000101
(II) appointing the number of hidden layer neurons of the extreme learning machine as n h Regularization parameters beta, lambda, mu, theta and maximum iteration times max _ iteration, initializing a variable weighting matrix
Figure BDA0003680380850000102
Calculating the distance between each sample and selecting the ith sample x i Constructing an initial Laplace matrix by using the latest k samples
Figure BDA0003680380850000103
The specific process comprises the following steps:
first, a variable weighting matrix M is initialized by equation (3), where equation (3) is expressed as:
Figure BDA0003680380850000104
wherein M is an element on the diagonal
Figure BDA0003680380850000105
The remaining elements are a diagonal matrix of 0 s,
Figure BDA0003680380850000106
next, an initial laplacian matrix L is calculated by formula (4) -formula (7), which is as follows:
Figure BDA0003680380850000107
Figure BDA0003680380850000108
Figure BDA0003680380850000109
Figure BDA00036803808500001010
in the formula, the first step is that,
Figure BDA00036803808500001011
D ii for the ith element on the diagonal of the diagonal matrix D, L is the laplacian matrix corresponding to the dataset X, and then equation (4) is solved according to equation (8) -equation (12):
Figure BDA00036803808500001012
Figure BDA00036803808500001013
equation (9) is written as lagrangian equation (10) by defining two lagrangian multipliers. For equation (10) with respect to s i Obtaining a partial derivative of 0
Figure BDA00036803808500001014
Obtaining an optimal solution according to the KTT condition as shown in formula (11),
Figure BDA00036803808500001015
Figure BDA00036803808500001016
wherein
Figure BDA00036803808500001017
Wherein (·) + When the value in the parenthesis is greater than 0, the value itself is taken, and when the value is less than or equal to 0,0 is taken, and the formula (12) is used to obtain s which is sparsely expressed by only k nonzero elements i
Figure BDA0003680380850000111
Equation (12) is further reduced to equation (13):
Figure BDA0003680380850000112
since γ is related to k, k is an integer and 0 ≦ k ≦ n, the parameter γ may be expressed as equation (14):
Figure BDA0003680380850000113
substituting η and γ into equation (11) to obtain:
Figure BDA0003680380850000114
(III) randomly generating a weight matrix between the input layer and the hidden layer of the extreme learning machine
Figure BDA0003680380850000115
And a matrix of bias terms
Figure BDA0003680380850000116
And computing hidden layer outputs using activation functions
Figure BDA0003680380850000117
The method comprises the following specific steps:
firstly, randomly generating a weight matrix W between an input layer and a hidden layer of the extreme learning machine in a range of (-1, 1) in And bias term matrix B in
Then, calculating an extreme learning machine hidden layer output matrix by using the sigmoid function and the data X
Figure BDA0003680380850000118
The expression of the sigmoid function is shown in formula (16):
Figure BDA0003680380850000119
finally, the hidden layer is output H 0 And the input data X are combined according to the rows to obtain an augmented data matrix H ═ H 0 ,X];
(IV) updating the weights between hidden layer and output layer
Figure BDA00036803808500001110
Bias b and model predictive tag value
Figure BDA00036803808500001111
The specific process comprises the following steps:
firstly, organically integrating an extreme learning machine, variable weighting and self-adaptive local composition into a unified optimization objective function, and minimizing an optimization problem shown as an equation (17):
Figure BDA00036803808500001112
Figure BDA00036803808500001113
wherein Tr (-) represents a matrix tracing operation,
Figure BDA00036803808500001114
represents l 2 The square of the norm, 1, represents a column vector with all elements 1, and the diagonal matrix U ═ diag (β, β, … β,0,0, …,0) ∈ R n×n I.e. give n l The label values are given certain weights of beta, lambda, mu, beta and theta as given regularization parameters, and
Figure BDA0003680380850000121
w, b and f are weight, bias and model prediction tag values between the hidden layer and the output layer respectively;
then, the similarity matrix is fixed
Figure BDA0003680380850000122
Variable weighting matrix
Figure BDA0003680380850000123
Obtaining the optimization problem description about w, b and f is shown in equation (18), and thus an analytical expression of w, b can be obtained as shown in equation (19):
Figure BDA0003680380850000124
Figure BDA0003680380850000125
Figure BDA0003680380850000126
in formula (19), a ═ λ (λ HH) C H T +I q×q ) -1 H T H C Wherein q is d + n h
Figure BDA0003680380850000127
I denotes the identity matrix, 1 denotes the column vector with all elements 1, and the objective function Hw +1 in equation (18) is substituted by equation (19) n×1 b yields equation (20):
Figure BDA0003680380850000128
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003680380850000129
finally, from equation (19) and equation (20), the optimization problem in equation (18) can be converted to equation (21):
Figure BDA00036803808500001210
the partial derivative of f is calculated and made to be 0, and the analytical expression of f can be obtained as shown in formula (22),
f=(U+L+μλH C -μλ 2 N) -1 Uy (22)
wherein
Figure BDA00036803808500001211
(V) Using the similarity matrix
Figure BDA00036803808500001212
Updating a variable weighting matrix
Figure BDA00036803808500001213
The specific process comprises the following steps:
first, the output is fixed
Figure BDA00036803808500001214
The similarity matrix S, the objective function, is simplified from equation (17) to equation (23):
Figure BDA00036803808500001215
then, by solving equation (23), the updated equation (24) of the variable weighting matrix M can be obtained:
Figure BDA0003680380850000131
wherein, t i =z ii ,Z=X T LX,z ii Is the element on the ith main diagonal of the matrix Z;
(VI) updating the similarity matrix
Figure BDA0003680380850000132
The specific process comprises the following steps:
first, the output is fixed
Figure BDA0003680380850000133
Variable weighting matrix
Figure BDA0003680380850000134
The objective function is simplified from equation (17) to equation (25):
Figure BDA0003680380850000135
equation (25) can then be further reduced to solve the optimization problem as shown in equation (26):
Figure BDA0003680380850000136
finally, obtaining an updating formula (27) of the similarity matrix S according to the principle in the step (II),
Figure BDA0003680380850000137
in the formula (27), the first and second groups,
Figure BDA0003680380850000138
and (seventhly) repeating the steps (four), (five) and (six) until the maximum iteration number max _ iteration is reached, wherein the specific process is as follows:
and (5) repeating the steps (four), (five) and (six) until the maximum iteration number max _ iteration is reached.
(eight) an online testing stage: for the collected n t A test data
Figure BDA0003680380850000139
Using training data
Figure BDA00036803808500001310
Mean of (X) 0 ) And mean square deviation std (X) 0 ) Test data X by equation (28) new The normalization process is performed, and equation (28) is expressed as:
Figure BDA00036803808500001311
obtaining standardized test data
Figure BDA00036803808500001312
Obtaining a prediction result by using a given model
Figure BDA00036803808500001313
Formula (29) -formula (30):
H t0 =X test W in +B in (29)
Figure BDA00036803808500001314
wherein the content of the first and second substances,
Figure BDA00036803808500001315
for hidden layer output, the hidden layer output H t0 And input data X test Merging by rows to obtain an augmented data matrix
Figure BDA00036803808500001316
Finally predicting the result y of the ELM test Performing anti-standardization to obtain X new Corresponding estimated value
Figure BDA0003680380850000141
As shown in equation (31):
y new =y test ×std(y 0 )+mean(y 0 ) (31)
according to the method provided by the embodiment of the invention, variable weighting, self-adaptive composition and extreme learning machine modeling are integrated in a unified learning frame by the model, and an overall optimal solution of modeling learning is obtained by adopting alternate iterative optimization solution. The method comprehensively utilizes weighted Euclidean distance information of a data input space and a prediction output space to construct a neighbor graph in a self-adaptive mode to achieve accurate approximation of potential structure information of data, a variable weighted self-adaptive local composition extreme learning machine conducts variable weighted learning of input samples on the basis of the self-adaptive local composition extreme learning machine, different weights are given to different auxiliary variables through the variable weighted learning, and therefore the adverse effects of redundant variables and noise on composition and regression learning are reduced. The method improves the performance of the extreme learning machine model by utilizing the supervision information contained in the label data and assisting the structure information contained in the label-free data.
In order to illustrate the effect of the above-mentioned extreme learning machine soft measurement modeling method based on variable weighting adaptive local composition, the present invention is further described below with reference to specific embodiments.
Example (b): the process data for the debutanizer column is taken as an example for illustration.
The debutanizer rectification column is part of a desulfurization and naphtha separator unit, and its main task is to maximize the C5 (stabilized gasoline) content in the debutanizer overhead (liquefied petroleum gas separator feed) and minimize the C4 (butane) content in the debutanizer bottom (Naptha separator feed). A block diagram of which is shown in fig. 2. Besides the rectifying tower (T102), the debutanizer also comprises equipment such as a heat exchanger (E105B), an overhead condenser (E107AB), a bottom reboiler (E108AB), an overhead reflux pump (P102AB), a water feeding pump (P103AB) of an LPG separator and the like. The C5 content in the overhead of the debutanizer column was measured indirectly by an analyzer located at the bottom of the lpg fractionation column of unit number 900. The measurement period of the device was 10 minutes. In addition, the position of the measuring device causes a delay which is not known but is constant, possibly in the range of 20-60 minutes. Similarly, the C4 content in the debutanizer bottom could not be detected directly at the bottom, but was detected by installing a gas chromatograph at the top of the column. The measurement cycle of the apparatus is generally 15 minutes, and also due to the installation position of the analysis instrument, there is a considerable delay in obtaining the concentration values, which is not well known, but is constant and may be in the range of 30-75 minutes. Therefore, in order to realize real-time measurement of butane concentration and improve the control quality of the debutanizer, it is necessary to establish a soft measurement model to estimate the bottom butane concentration in real time. In addition, in consideration of the problems of low sampling efficiency and large time delay of quality variables in the actual production process, it is assumed that only one fifth of all the historical samples have labels (including both input data and output data), and the other historical samples are unlabeled samples (including only input data).
The specific steps of the invention are explained next in connection with the debutanizer production process:
1. an off-line modeling stage: the acquired data is used as a training data set and is preprocessed.
First, for allPreprocessing a sample, and deleting an abnormal sample in the sample; then, considering the dynamic characteristics of the process, performing dimension expansion on all samples, wherein the feature number of the expanded samples is 30; finally, carrying out standardization processing to obtain final training offline training data
Figure BDA0003680380850000151
Sorting the collected quality variable values in rows to obtain quality variable data row vectors
Figure BDA0003680380850000152
Further, 1440 rows of all-0 row vectors are defined
Figure BDA0003680380850000153
Will y l And y u Are combined into a row vector
Figure BDA0003680380850000154
By y 0 Mean of (y) 0 ) And the mean square error std (y) 0 ) To y 0 Is subjected to standardization treatment to obtain
Figure BDA0003680380850000155
Obtaining off-line training data X, y of the extreme learning machine;
2. an initial laplacian matrix is constructed from the training dataset.
Appointing the number of hidden layer neurons of the extreme learning machine to be 5, and the regularization parameters beta, lambda, mu and theta to be 5, 5 and 10 respectively -2 ,10 3 And maximum number of iterations 15, initializing a variable weighting matrix
Figure BDA0003680380850000156
Calculating the distance between each sample and selecting the ith sample x i Constructing an initial Laplace matrix by using the latest 3 samples
Figure BDA0003680380850000157
3. Computing hidden layer outputs using activation functions
Figure BDA0003680380850000158
Randomly generating weights and offsets between the input layer and the hidden layer, and calculating the hidden layer output using an activation function
Figure BDA0003680380850000159
4. Updating model parameters
Figure BDA00036803808500001510
Variable weighting matrix
Figure BDA00036803808500001511
Similarity matrix
Figure BDA00036803808500001512
5. Repeating the step 4 until the maximum iteration number 15 is reached;
6. and (3) an online testing stage: collecting test data
Figure BDA00036803808500001513
The number of the dominant variable values collected by the test set is 400, and training data X is utilized 0 Mean value of (X) 0 ) Sum mean square deviation std (X) 0 ) For test data X new Carrying out standardization processing to obtain standardized test data
Figure BDA00036803808500001514
Predicting results for ELM
Figure BDA00036803808500001515
Performing anti-standardization to obtain X new Corresponding estimated value
Figure BDA00036803808500001516
Determining the coefficient (R) using Root Mean Square Error (RMSE) 2 ) And comprehensively evaluating the prediction performance of the soft measurement model by three evaluation indexes of Mean Absolute Error (MAE), wherein the expressions of the three evaluation indexes are as followsEquation (32) -equation (34):
Figure BDA00036803808500001517
Figure BDA00036803808500001518
Figure BDA00036803808500001519
in the formula, y i And
Figure BDA00036803808500001520
respectively the real value and the predicted value of the target variable of the ith sample,
Figure BDA00036803808500001521
is the average of all sample target variables. Determining the coefficient R 2 The reliability of the prediction result can be measured, and the closer the calculation result is to 1, the better the prediction effect of the soft measurement model is. And calculating the prediction error of the soft measurement model by adopting the RMSE and the MAE, wherein the smaller the error value is, the higher the prediction precision of the soft measurement model is.
Table 1 shows the fitting conditions of the conventional semi-supervised extreme learning machine model, the adaptive local patterning extreme learning machine model and the variable weighted adaptive local patterning extreme learning machine model of the present invention to the debutanizer data in 15 simulation experiments under the optimal parameters.
TABLE 1
Figure BDA0003680380850000161
As can be seen from Table 1, the best results are obtained overall by the method of the invention, test set MAE, R 2 The RMSE is improved to some extent.
By combining the analysis, the extreme learning machine model of the variable weighting self-adaptive local composition provided by the invention not only can be used for self-adaptively constructing a neighbor graph by comprehensively utilizing the weighted Euclidean distance information of the data input space and the predicted output space to realize accurate approximation of the data potential structure information, but also can be used for endowing different auxiliary variables with different weights through variable weighting learning, thereby improving the generalization capability and reliability of the model.
The semi-supervised extreme learning machine model, the self-adaptive local composition semi-supervised extreme learning machine model and the influence graphs of the method on the decision coefficients of the debutanizer data on the predicted value and the true value under different regularization parameters are shown in fig. 3, 4 and 5. FIG. 6 is a diagram of coefficient change determined by a test set of three models under optimal parameters, and it can be seen from FIG. 6 that the method of the present invention has higher prediction accuracy compared with the conventional method.
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are possible within the spirit and scope of the claims.

Claims (8)

1. A semi-supervised soft measurement modeling method of an extreme learning machine based on variable weighting self-adaptive local composition is characterized by comprising the following specific steps:
an off-line modeling stage: collecting assay analysis values y of leading variables i And auxiliary variable measured values corresponding thereto
Figure FDA0003680380840000011
Where i is 1,2, …, n l ,n l D is the dimension of the auxiliary variable, and is the number of the collected main variable values; additional Collection of n u A measured value of an auxiliary variable
Figure FDA0003680380840000012
j=n l +1,n l +2,…,n l +n u In which n is defined as n l +n u The number of the collected auxiliary variable values; pressing the collected auxiliary variable values into linesSequencing to obtain an auxiliary variable data matrix
Figure FDA0003680380840000013
The superscript T represents matrix transposition operation, and accordingly, the collected dominant variable values are sorted by rows to obtain dominant variable data row vectors
Figure FDA0003680380840000014
Further, define n u All 0 row vector
Figure FDA0003680380840000015
Will y l And y u Are combined into a row vector
Figure FDA0003680380840000016
By using X 0 Mean value of (X) 0 ) And mean square deviation std (X) 0 ) To X 0 Is subjected to standardization treatment to obtain
Figure FDA0003680380840000017
By y 0 Mean of (y) 0 ) And the mean square error std (y) 0 ) To y 0 Is subjected to standardization treatment to obtain
Figure FDA0003680380840000018
Obtaining off-line training data X, y of the extreme learning machine;
(II) appointing the number of hidden layer neurons of the extreme learning machine as n h Regularization parameters beta, lambda, mu and theta and maximum iteration times max _ iteration, and initializing a variable weighting matrix
Figure FDA0003680380840000019
Calculating the distance between each sample and selecting the ith sample x i Constructing an initial Laplace matrix by using the nearest k samples
Figure FDA00036803808400000110
(III)Randomly generating weight matrix between input layer and hidden layer of extreme learning machine
Figure FDA00036803808400000111
And a matrix of bias terms
Figure FDA00036803808400000112
Computing hidden layer outputs using activation functions
Figure FDA00036803808400000113
(IV) updating the weights between hidden layer and output layer
Figure FDA00036803808400000114
Bias b and model predictive tag value
Figure FDA00036803808400000115
(V) updating the variable weighting matrix by using the similarity matrix
Figure FDA00036803808400000116
(VI) updating the similarity matrix
Figure FDA00036803808400000117
(seventhly) repeating the steps (four), (five) and (six) until the maximum iteration number max _ iteration is reached;
(eight) an online testing stage: collecting test data
Figure FDA00036803808400000118
n t Using training data X for the number of dominant variable values collected for the test set 0 Mean value of (X) 0 ) And mean square deviation std (X) 0 ) For test data X new Carrying out standardization processing to obtain standardized test data
Figure FDA00036803808400000119
Predicting results for ELM
Figure FDA00036803808400000120
Performing anti-standardization to obtain X new Corresponding estimated value
Figure FDA00036803808400000121
2. The extreme learning machine semi-supervised soft measurement modeling method based on variable weighting adaptive local composition as recited in claim 1, wherein in the step (one), training data is firstly utilized
Figure FDA00036803808400000122
Mean value of (X) 0 ) Sum mean square deviation std (X) 0 ) Training data X by equation (1) 0 The normalization process is performed, and the expression of formula (1) is:
Figure FDA0003680380840000021
in the formula, mean (-) represents the mean value of each column of the calculation matrix, std (-) represents the mean square error of each column of the calculation matrix, and normalized training data are obtained
Figure FDA0003680380840000022
To pair
Figure FDA0003680380840000023
A similar normalization process is also required as in equation (2):
Figure FDA0003680380840000024
3. variable-based weighting as claimed in claim 2The extreme learning machine semi-supervised soft measurement modeling method for the self-adaptive local composition is characterized in that in the step (II), a weighting matrix aiming at the initialized variables is adopted
Figure FDA0003680380840000025
Calculating the distance between each sample and selecting the ith sample x i Constructing an initial Laplace matrix by using the nearest k samples
Figure FDA0003680380840000026
The specific steps are as follows;
first, the variable weighting matrix M is initialized by equation (3), where equation (3) is expressed as:
Figure FDA0003680380840000027
wherein M is an element on the diagonal
Figure FDA0003680380840000028
The remaining elements are a diagonal matrix of 0 s,
Figure FDA0003680380840000029
next, an initial laplacian matrix L is calculated by formula (4) -formula (7), which is as follows:
Figure FDA00036803808400000210
Figure FDA00036803808400000211
Figure FDA00036803808400000212
Figure FDA00036803808400000213
in the above-mentioned formula,
Figure FDA00036803808400000214
D ii for the ith element on the diagonal of the diagonal matrix D, L is the laplacian matrix corresponding to the dataset X, and then equation (4) is solved according to equation (8) -equation (12):
Figure FDA00036803808400000215
Figure FDA00036803808400000216
equation (9) is written as lagrangian equation (10) by defining two lagrangian multipliers. For equation (10) with respect to s i Obtaining a partial derivative of 0
Figure FDA0003680380840000031
The optimal solution obtained according to the KTT condition is shown in formula (11),
Figure FDA0003680380840000032
Figure FDA0003680380840000033
wherein
Figure FDA0003680380840000034
Wherein (·) + When the value in the parentheses is greater than 0, the value itself is taken, and when the value is less than or equal to 0,0 is taken, and the formula is used(12) Obtaining s in a sparse representation of only k non-zero elements i
Figure FDA0003680380840000035
Equation (12) is further reduced to equation (13):
Figure FDA0003680380840000036
since γ is related to k, k is an integer and 0 ≦ k ≦ n, the parameter γ may be expressed as equation (14):
Figure FDA0003680380840000037
substituting η and γ into equation (11) to obtain:
Figure FDA0003680380840000038
4. the extreme learning machine semi-supervised soft measurement modeling method based on variable weighted adaptive local composition as recited in claim 3, wherein in the third step, the weight matrix between the input layer and the hidden layer of the extreme learning machine is randomly generated
Figure FDA0003680380840000039
And a matrix of bias terms
Figure FDA00036803808400000310
And computing hidden layer outputs using activation functions
Figure FDA00036803808400000311
The method comprises the following specific steps:
first, in the (-1, 1) rangeIn the enclosure, randomly generating a weight matrix W between the input layer and the hidden layer of the extreme learning machine in And a matrix of bias terms B in
Then, calculating an extreme learning machine hidden layer output matrix by using the sigmoid function and the data X
Figure FDA00036803808400000312
The expression of the sigmoid function is shown in formula (16):
Figure FDA00036803808400000313
finally, the hidden layer is output H 0 And the input data X are combined according to the rows to obtain an augmented data matrix H ═ H 0 ,X]。
5. The extreme learning machine semi-supervised soft measurement modeling method based on variable weighted adaptive local composition as recited in claim 4, wherein in the step (IV), the weights between the hidden layer and the output layer are updated
Figure FDA0003680380840000041
Bias b and model predictive tag value
Figure FDA0003680380840000042
The method comprises the following specific steps:
firstly, organically integrating an extreme learning machine, variable weighting and self-adaptive local composition into a unified optimization objective function, and minimizing an optimization problem shown as an equation (17):
Figure FDA0003680380840000043
Figure FDA0003680380840000044
wherein Tr (-) represents a matrix tracing operation,
Figure FDA0003680380840000045
represents l 2 Norm squared, 1 denotes a column vector with all elements 1, and the diagonal matrix U ═ diag (β, β, … β,0,0, …,0) ∈ R n×n I.e. give n l The label values are given certain weights of beta, lambda, mu, beta and theta as given regularization parameters, and
Figure FDA0003680380840000046
w, b and f are weight, bias and model prediction tag values between the hidden layer and the output layer respectively;
then, the similarity matrix is fixed
Figure FDA0003680380840000047
Variable weighting matrix
Figure FDA0003680380840000048
Obtaining the optimization problem description about w, b and f is shown in equation (18), and thus an analytical expression of w, b can be obtained as shown in equation (19):
Figure FDA0003680380840000049
Figure FDA00036803808400000410
in the formula (19), a ═ λ (λ HH) C H T +I q×q ) -1 H T H C Wherein, let q be d + n h
Figure FDA00036803808400000411
I denotes an identity matrix, 1 denotes a column vector with all elements 1, and the objective function Hw +1 in equation (18) is substituted with equation (19) n×1 b obtainingEquation (20):
Figure FDA00036803808400000412
in the formula (20)
Figure FDA00036803808400000413
Finally, the optimization problem formula (18) is converted to formula (21) according to formula (19) and formula (20):
Figure FDA0003680380840000051
the partial derivative of f is calculated and made to be 0, the optimal solution is obtained as shown in formula (22),
f=(U+L+μλH C -μλ 2 N) -1 Uy (22)
wherein
Figure FDA0003680380840000052
X C =HH C
6. The extreme learning machine semi-supervised soft measurement modeling method based on variable weighted adaptive local composition as recited in claim 5, wherein in the step (V), a similarity matrix is utilized
Figure FDA0003680380840000053
Updating a variable weighting matrix
Figure FDA0003680380840000054
The method comprises the following specific steps:
first, the output f, the similarity matrix S, is fixed, and the objective function is reduced from equation (17) to equation (23):
Figure FDA0003680380840000055
then, by solving equation (23), the updated equation (24) of the variable weighting matrix M can be obtained:
Figure FDA0003680380840000056
wherein, t i =z ii ,Z=X T LX,z ii Is the element on the ith main diagonal of the matrix Z.
7. The extreme learning machine semi-supervised soft measurement modeling method based on variable weighted adaptive local composition as recited in claim 6, wherein in the sixth step (VI), the similarity matrix is updated
Figure FDA0003680380840000057
The method comprises the following specific steps:
first, the output is fixed
Figure FDA0003680380840000058
Variable weighting matrix
Figure FDA0003680380840000059
The objective function is simplified from equation (17) to equation (25):
Figure FDA00036803808400000510
equation (25) can then be further reduced to solve the optimization problem as shown in equation (26):
Figure FDA00036803808400000511
finally, obtaining an updating formula (27) of the similarity matrix S according to the principle in the step (II),
Figure FDA00036803808400000512
in the formula (27), the first and second groups of the chemical reaction are shown in the specification,
Figure FDA00036803808400000513
8. the extreme learning machine semi-supervised soft measurement modeling method based on variable weighted adaptive local composition as recited in claim 7, wherein in the step (eight), the specific steps in the online testing stage are as follows:
first, for n collected t A test data
Figure FDA0003680380840000061
Using training data
Figure FDA0003680380840000062
Mean value of (X) 0 ) Sum mean square deviation std (X) 0 ) Test data X by equation (28) new The normalization process is performed, and equation (28) is expressed as:
Figure FDA0003680380840000063
then, based on the standardized test data
Figure FDA0003680380840000064
Calculating the output value of the test data by the formula (29) and the formula (30)
Figure FDA0003680380840000065
The formula (29) and the formula (30) are respectively expressed as:
H t0 =X test W in +B in (29)
Figure FDA0003680380840000066
wherein, the first and the second end of the pipe are connected with each other,
Figure FDA0003680380840000067
for hidden layer output, the hidden layer output H t0 And input data X test Merging by rows to obtain an augmented data matrix
Figure FDA0003680380840000068
Finally, the predicted result y is paired test Performing anti-standardization to obtain X new Corresponding estimated value
Figure FDA0003680380840000069
As shown in equation (31):
y new =y test ×std(y 0 )+mean(y 0 ) (31)。
CN202210632112.3A 2022-06-07 2022-06-07 Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition Pending CN114936528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210632112.3A CN114936528A (en) 2022-06-07 2022-06-07 Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210632112.3A CN114936528A (en) 2022-06-07 2022-06-07 Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition

Publications (1)

Publication Number Publication Date
CN114936528A true CN114936528A (en) 2022-08-23

Family

ID=82867154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210632112.3A Pending CN114936528A (en) 2022-06-07 2022-06-07 Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition

Country Status (1)

Country Link
CN (1) CN114936528A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116738866A (en) * 2023-08-11 2023-09-12 中国石油大学(华东) Instant learning soft measurement modeling method based on time sequence feature extraction
CN117272244A (en) * 2023-11-21 2023-12-22 中国石油大学(华东) Soft measurement modeling method integrating feature extraction and self-adaptive composition
EP4379617A1 (en) * 2022-12-01 2024-06-05 Siemens Mobility GmbH Assessment of input-output datasets using neighborhood criteria in input space and output space

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4379617A1 (en) * 2022-12-01 2024-06-05 Siemens Mobility GmbH Assessment of input-output datasets using neighborhood criteria in input space and output space
CN116738866A (en) * 2023-08-11 2023-09-12 中国石油大学(华东) Instant learning soft measurement modeling method based on time sequence feature extraction
CN116738866B (en) * 2023-08-11 2023-10-27 中国石油大学(华东) Instant learning soft measurement modeling method based on time sequence feature extraction
CN117272244A (en) * 2023-11-21 2023-12-22 中国石油大学(华东) Soft measurement modeling method integrating feature extraction and self-adaptive composition
CN117272244B (en) * 2023-11-21 2024-03-15 中国石油大学(华东) Soft measurement modeling method integrating feature extraction and self-adaptive composition

Similar Documents

Publication Publication Date Title
Yuan et al. A dynamic CNN for nonlinear dynamic feature learning in soft sensor modeling of industrial process data
CN114936528A (en) Extreme learning machine semi-supervised soft measurement modeling method based on variable weighting self-adaptive local composition
Yuan et al. Deep learning-based feature representation and its application for soft sensor modeling with variable-wise weighted SAE
CN112101480B (en) Multivariate clustering and fused time sequence combined prediction method
Li et al. Fault diagnosis for distillation process based on CNN–DAE
CN106156401B (en) Multi-combination classifier based data driving system state model online identification method
CN111444953B (en) Sensor fault monitoring method based on improved particle swarm optimization algorithm
CN110579967B (en) Process monitoring method based on simultaneous dimensionality reduction and dictionary learning
CN109635245A (en) A kind of robust width learning system
CN111126575A (en) Gas sensor array mixed gas detection method and device based on machine learning
CN116448419A (en) Zero sample bearing fault diagnosis method based on depth model high-dimensional parameter multi-target efficient optimization
Tang et al. A new active learning strategy for soft sensor modeling based on feature reconstruction and uncertainty evaluation
CN114297918A (en) Aero-engine residual life prediction method based on full-attention depth network and dynamic ensemble learning
CN112989711B (en) Aureomycin fermentation process soft measurement modeling method based on semi-supervised ensemble learning
de Lima et al. Ensemble deep relevant learning framework for semi-supervised soft sensor modeling of industrial processes
CN114821155A (en) Multi-label classification method and system based on deformable NTS-NET neural network
Yang et al. Domain adaptation network with uncertainty modeling and its application to the online energy consumption prediction of ethylene distillation processes
CN115186584A (en) Width learning semi-supervised soft measurement modeling method integrating attention mechanism and adaptive composition
Yuan et al. Quality prediction modeling for industrial processes using multiscale attention-based convolutional neural network
CN116821695B (en) Semi-supervised neural network soft measurement modeling method
Jakubek et al. Artificial neural networks for fault detection in large-scale data acquisition systems
CN114117852A (en) Regional heat load rolling prediction method based on finite difference working domain division
CN114970698B (en) Metering equipment operation performance prediction method based on improved LWPS
CN112163632A (en) Application of semi-supervised extreme learning machine based on bat algorithm in industrial detection
CN116738866A (en) Instant learning soft measurement modeling method based on time sequence feature extraction

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination