CN113570161B - Method for constructing stirred tank reactant concentration prediction model based on width transfer learning
- Publication number
- CN113570161B (application CN202110999453.XA)
- Authority
- CN
- China
- Legal status: Active
Classifications
- G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/045: Neural networks; combinations of networks
- G06N3/047: Neural networks; probabilistic or stochastic networks
- G06N3/061: Physical realisation of neural networks using biological neurons
- G06N3/08: Neural networks; learning methods
Abstract
A method for constructing a stirred tank reactant concentration prediction model based on width transfer learning belongs to the technical field of reactant concentration prediction in continuous stirred tank reactors. It comprises the following steps: (1) data acquisition and integration; (2) model building; and (3) model prediction. The constructed model introduces a transfer learning method that needs no target-domain labels into the field of stirred tank reactant concentration prediction, and can effectively address the problem that, in the production process of a continuous stirred tank reactor, the reactant concentration cannot be measured often enough to evaluate product quality. By selecting the reactant concentration as the prediction label, the transfer learning method solves the domain adaptation problem between a labeled working condition and an unlabeled working condition and improves prediction performance.
Description
Technical Field
The invention belongs to the technical field of prediction of reactant concentration in a continuous stirred tank reactor, and particularly relates to a method for constructing a stirred tank reactant concentration prediction model based on width transfer learning.
Background
A continuous stirred tank reactor (CSTR) is a device for carrying out various physical and chemical reactions in chemical processes. In the production of the three major synthetic materials (plastics, chemical fibers and synthetic rubber), CSTRs account for about 90% of all reactors used for synthetic production. They are also widely used in the pharmaceutical, paint, fuel, pesticide and other industries. The reaction mechanism of a CSTR is complex, with strong time-varying behavior and nonlinearity. Diversified market demands require frequent changes of production operating conditions, product ratios and the like, which leads to multiple working conditions in the production process. In a multi-working-condition process, most working conditions yield only a small amount of labeled data, or even none, so an accurate soft sensor model for predicting the reactant concentration cannot be built directly for most working conditions.
Transfer learning can establish a soft sensor model between working conditions with different amounts of labeled data and predict key variables. Common unsupervised transfer algorithms include Transfer Component Analysis (TCA), Subspace Alignment (SA) and the Geodesic Flow Kernel (GFK) method; supervised transfer algorithms include the Domain Adaptation Extreme Learning Machine (DAELM), Joint Distribution Adaptation (JDA) and Balanced Distribution Adaptation (BDA).
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method for constructing a stirred tank reactant concentration prediction model based on a width transfer learning network, taking the width learning network as the framework and introducing manifold regularization to construct a graph Laplacian matrix.
The invention provides the following technical scheme:
A method for constructing a stirred tank reactant concentration prediction model based on width transfer learning comprises the following steps:
(1) Acquisition and integration of data
Two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program. The reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training. The source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples. The source-domain and target-domain data are integrated to obtain the input feature data of the width transfer learning network, which is used to predict the label data of the target-domain working condition;
(2) Model building
The model is based on a width learning network. First, the feature nodes of the width learning network extract features from the data; because the network parameters are random and the feature extraction is therefore biased, a sparse autoencoder is introduced to adjust the input weights. Features are then re-extracted by the enhancement layer, and the features jointly extracted by the feature nodes and enhancement nodes serve as the final hidden-layer features. Manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data, so that the model adapts well to prediction on unlabeled target-domain data;
(3) Model prediction
The label data of the target-domain working condition are compared with the concentrations predicted by the model, and the RMSE loss of the model is calculated as its evaluation standard.
The method for constructing the stirred tank reactant concentration prediction model based on the width transfer learning is characterized in that the process of the step (1) is as follows:
1.1, data acquisition: two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program; the reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training; the source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples;
1.2, integrating the auxiliary-variable data of the source and target domains: let X_s be the source-domain data, y_s the source-domain output label (the reactant concentration) and X_t the target-domain data, and let X = [X_s, X_t], where X denotes the integrated source- and target-domain data; the integrated data are standardized to eliminate the scale differences between the source-domain and target-domain data, with the specific calculation formula:

X_new = (X - μ) / σ

where X_new denotes the input data after normalization, X the unprocessed raw data, μ the mean of the raw data and σ the standard deviation of the raw data;
1.3, integrating the source-domain labels and the target labels: the source-domain label y_s and an N_t × 1 zero matrix matching the target-domain size form the total label y output by the network:

y = [y_s; 0_(N_t×1)]

where N_s denotes the number of source-domain samples and N_t the number of target-domain samples.
The method for constructing the stirred tank reactant concentration prediction model based on width transfer learning is characterized in that, in the step (2), the model transfers between working conditions through the width transfer learning network, based on a manifold regularization framework, to predict the reactant concentration of the target working condition; the specific process is as follows:
2.1, the feature nodes of the width learning network extract features from the data; the width learning network has only one hidden layer, composed of feature nodes and enhancement nodes, and the feature extraction formula is:

Z_i = φ_i(X W_ei + β_ei)

Z^n = [Z_1, Z_2, ..., Z_n]

where Z_i denotes the features learned by the i-th feature group, Z^n the total features extracted by all feature groups, φ_i the activation function of the i-th feature group, and W_ei and β_ei the randomly generated weights and biases of the i-th feature group; because the extracted features are random, a sparse autoencoder is used to adjust the weights W_ei into final weights W'_ei, with which features are extracted from X again to obtain the new features Z_i;
2.2, the enhancement nodes perform secondary feature extraction with a random orthogonal matrix, the extraction formula being:

H_j = ξ_j(Z^n W_hj + β_hj)

where H_j denotes the features extracted by the j-th enhancement node, and ξ_j, W_hj and β_hj denote the activation function, randomly generated weights and biases of the enhancement layer, respectively;
2.3, the features jointly extracted by the feature nodes and enhancement nodes of the width learning network serve as the final hidden-layer features:

G = [Z^n | H^m]

where G denotes the final features extracted by the hidden layer, G ∈ R^(N×(nl+m)), N denotes the number of samples, n the number of nodes per feature group, l the number of feature groups, m the number of enhancement nodes, and H^m the total features extracted by the whole enhancement layer, H^m = [H_1, ..., H_m];
the loss function is expressed as:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖²

where w denotes the weights between the hidden layer and the output layer and γ the regularization coefficient of the model; Λ_e is a diagonal weighting matrix obtained by an empirical formula, with (Λ_e)_ii = 1 for i ≤ N_s (labeled source samples) and (Λ_e)_ii = 0 for i > N_s (unlabeled target samples), consistent with the zero-padded label vector y;
2.4, manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data; the manifold regularization optimization function is expressed as:

min (1/2) Σ_ij w_ij (ŷ_i - ŷ_j)² = tr(Y^T L Y), Y = Gw

where ŷ_i and ŷ_j denote the predicted values of samples x_i and x_j, w_ij the weight between the two samples, X the sample data and Y = Gw the output of the network; D is a diagonal matrix whose diagonal elements are D_ii = Σ_j w_ij;
L denotes the graph Laplacian matrix, which is calculated as:

L = D - A

where A = [w_ij] denotes the graph connection matrix composed of the weights w_ij, so that the final matrix L has entries L_ii = D_ii - w_ii and L_ij = -w_ij for i ≠ j;
the manifold regularization expression is:

M(X) = tr(w^T G^T L G w)

The manifold regularization framework is added to the objective function as part of the minimum loss function, and the minimum loss function of the whole network is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖² + λ tr(w^T G^T L G w)

where λ is the manifold regularization coefficient.
When the number of hidden-layer neurons is smaller than the input data size, the output weight w is:

w = (γI + G^T Λ_e G + λ G^T L G)^(-1) G^T Λ_e y

Otherwise, when the number of hidden-layer neurons is larger than the input data size, the output weight w is:

w = G^T (γI + Λ_e G G^T + λ L G G^T)^(-1) Λ_e y.
By adopting the above technical solution, compared with the prior art, the invention has the following beneficial effects:
The constructed model introduces a transfer learning method that needs no target-domain labels into the field of stirred tank reactant concentration prediction, and can effectively address the problem that, in the production process of a continuous stirred tank reactor, the reactant concentration cannot be measured often enough to evaluate product quality. By selecting the reactant concentration as the prediction label, the transfer learning method solves the domain adaptation problem between a labeled working condition and an unlabeled working condition and improves prediction performance.
Drawings
FIG. 1 is a block diagram of the width learning network in an embodiment of the present invention;
FIG. 2 is a graph of the predicted results for conditions 1 through 2 in an embodiment of the present invention;
FIG. 3 is a graph of the predicted results for operating conditions 2 through 1 in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and examples of the present invention. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
On the contrary, the invention is intended to cover any alternatives, modifications, equivalents and variations included within the spirit and scope of the invention as defined by the appended claims. Further, in the following detailed description, certain specific details are set forth to provide a better understanding of the invention; those skilled in the art will fully understand the invention even without some of these details.
Referring to fig. 1-3, the method for constructing the stirred tank reactant concentration prediction model based on width transfer learning comprises the following steps:
(1) Acquisition and integration of data
Data X_s, composed of the auxiliary variables, is taken as the source-domain data, and y_s denotes the source-domain output label, i.e. the reactant concentration. Data X_t, likewise composed of auxiliary variables, is the target-domain data. Let X = [X_s, X_t], where X denotes the data after the two domains are integrated. Because the source-domain and target-domain data differ in scale, the data need to be standardized, with the specific calculation formula:

X_new = (X - μ) / σ

where X_new denotes the input data after normalization, X the unprocessed raw data, μ the mean of the raw data and σ the standard deviation of the raw data.
The total label y output by the network consists of the source-domain label y_s stacked on an N_t × 1 zero matrix matching the target-domain size:

y = [y_s; 0_(N_t×1)]

where N_s denotes the number of source-domain samples and N_t the number of target-domain samples.
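As a concrete illustration of step (1), the following is a minimal NumPy sketch of the data integration, assuming Xs (N_s × d), ys (N_s × 1) and Xt (N_t × d) already hold the simulated source- and target-condition data; all function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def integrate_and_normalize(Xs, ys, Xt):
    """Stack source and target data, standardize, and zero-pad the target labels."""
    X = np.vstack([Xs, Xt])                          # X = [Xs; Xt]
    X_new = (X - X.mean(axis=0)) / X.std(axis=0)     # X_new = (X - mu) / sigma
    y = np.vstack([ys, np.zeros((Xt.shape[0], 1))])  # y = [ys; 0_(Nt x 1)]
    return X_new, y
```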
(2) And (3) establishing a model:
The model is based on a width learning network, a random-vector functional-link network; width learning retains the feature-learning capability of deep learning while achieving the efficient training of random networks. The method selects the reactant concentration as the key variable and predicts the reactant concentration of the target working condition by transferring between working conditions through the width learning network under a manifold regularization framework.
The width learning network has only one hidden layer, composed of feature nodes and enhancement nodes:

Z_i = φ_i(X W_ei + β_ei)

Z^n = [Z_1, Z_2, ..., Z_n]

where Z_i denotes the features learned by the i-th feature group and Z^n the total features extracted by all feature groups. φ_i denotes the activation function of the i-th feature group, and W_ei and β_ei are the randomly generated weights and biases of the i-th feature group. Because the extracted features are random, a sparse autoencoder is used to adjust the weights W_ei into final weights W'_ei. For the i-th feature group, features are extracted from X again with the adjusted weights W'_ei to obtain the new features Z_i.
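A minimal sketch of the feature-node mapping follows; the patent's sparse autoencoder is abbreviated here to a least-squares weight adjustment (a stand-in for the sparse solver used in broad learning systems), so treat this as an assumption rather than the exact procedure.

```python
def feature_nodes(X, n_groups, n_per_group, rng):
    """Feature-node mapping Z^n = [Z_1, ..., Z_n] with adjusted input weights."""
    Z_groups = []
    for _ in range(n_groups):
        We = rng.standard_normal((X.shape[1], n_per_group))  # random W_ei
        be = rng.standard_normal((1, n_per_group))           # random beta_ei
        A = np.tanh(X @ We + be)                             # phi_i(X W_ei + beta_ei)
        # Stand-in for the sparse autoencoder: least-squares weights W'_ei
        # such that X @ W'_ei reconstructs the random features A.
        We_adj = np.linalg.lstsq(X, A, rcond=None)[0]
        Z_groups.append(X @ We_adj)                          # re-extracted Z_i
    return np.hstack(Z_groups)
```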
The enhancement nodes then perform secondary feature extraction with a random orthogonal matrix:

H_j = ξ_j(Z^n W_hj + β_hj)

where H_j denotes the features extracted by the j-th enhancement node, and ξ_j, W_hj and β_hj denote the activation function, randomly generated weights and biases of the j-th enhancement node, respectively.

H^m = [H_1, ..., H_m]
where H^m is the total feature extracted by the enhancement nodes and m denotes the number of enhancement nodes. The total feature extracted by the feature nodes and enhancement nodes is G:

G = [Z^n | H^m]

where G ∈ R^(N×(nl+m)), N denotes the number of samples, n the number of nodes per feature group, l the number of feature groups and m the number of enhancement nodes. With w denoting the weights between the hidden layer and the output layer, the loss function is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖²

where γ denotes the regularization coefficient of the model, and Λ_e is a diagonal weighting matrix obtained by an empirical formula, with (Λ_e)_ii = 1 for i ≤ N_s (labeled source samples) and (Λ_e)_ii = 0 for i > N_s (unlabeled target samples), consistent with the zero-padded label vector y.
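The enhancement layer and the hidden-layer concatenation described above can be sketched as follows, again under illustrative assumptions (tanh activation, and m no larger than the width of Z^n so that QR yields m orthonormal columns):

```python
def enhancement_nodes(Zn, m, rng):
    """Secondary feature extraction H^m via a random orthogonal weight matrix."""
    W = rng.standard_normal((Zn.shape[1], m))
    Wh, _ = np.linalg.qr(W)          # orthonormal columns, assumes m <= Zn.shape[1]
    bh = rng.standard_normal((1, m))
    return np.tanh(Zn @ Wh + bh)     # xi(Z^n W_h + beta_h)

# Final hidden-layer feature G = [Z^n | H^m]:
# G = np.hstack([Zn, enhancement_nodes(Zn, m, rng)])
```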
Manifold regularization can explore the manifold features in the data, revealing its latent information and improving the adaptability of the model. Here, manifold regularization is added to the loss function to improve the transfer effect by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data.

Manifold regularization captures the structural features of the data through a neighbor graph: for any point in the data, a neighbor graph is built from its k nearest neighbors and each edge is assigned a weight, with the weight w_ij between samples computed by a Gaussian kernel function. The manifold regularization optimization function can be expressed as:

min (1/2) Σ_ij w_ij (ŷ_i - ŷ_j)² = tr(Y^T L Y), Y = Gw

where ŷ_i and ŷ_j denote the predicted values of samples x_i and x_j, w_ij the weight between the two samples, X the sample data and Y = Gw the output of the network. D is a diagonal matrix whose diagonal elements are D_ii = Σ_j w_ij.
L denotes the graph Laplacian matrix, which is calculated as:

L = D - A

where A = [w_ij] denotes the graph connection matrix composed of the weights w_ij, so the final matrix L has entries L_ii = D_ii - w_ii and L_ij = -w_ij for i ≠ j.
The manifold regularization expression is:

M(X) = tr(w^T G^T L G w)
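A sketch of the neighbor-graph construction follows; k and the kernel width sigma_g are illustrative hyperparameters, not values specified in the patent.

```python
def graph_laplacian(X, k=10, sigma_g=1.0):
    """k-nearest-neighbor graph with Gaussian-kernel weights; returns L = D - A."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    N = X.shape[0]
    A = np.zeros((N, N))
    for i in range(N):
        nn = np.argsort(d2[i])[1:k + 1]                       # k neighbors, skip self
        A[i, nn] = np.exp(-d2[i, nn] / (2.0 * sigma_g ** 2))  # Gaussian weights w_ij
    A = np.maximum(A, A.T)                                    # symmetrize the graph
    D = np.diag(A.sum(axis=1))                                # D_ii = sum_j w_ij
    return D - A                                              # graph Laplacian
```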
The manifold regularization framework is added to the objective function as part of the minimum loss function, so the final minimum loss function of the whole network is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖² + λ tr(w^T G^T L G w)

where λ is the manifold regularization coefficient.
When the number of hidden-layer neurons is smaller than the input data size, the output weight w is:

w = (γI + G^T Λ_e G + λ G^T L G)^(-1) G^T Λ_e y

Otherwise, when the number of hidden-layer neurons is larger than the input data size, the output weight w is:

w = G^T (γI + Λ_e G G^T + λ L G G^T)^(-1) Λ_e y
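Both closed-form solutions can be implemented directly; the sketch below assumes the Λ_e construction described above (unit weight on the N_s labeled source rows), and gamma and lam are illustrative regularization values.

```python
def solve_output_weights(G, y, L, Ns, gamma=1e-3, lam=1e-2):
    """Closed-form output weights w for both hidden-size regimes."""
    N, h = G.shape
    Le = np.zeros((N, N))
    Le[np.arange(Ns), np.arange(Ns)] = 1.0   # Lambda_e: weight labeled source rows
    if h <= N:   # fewer hidden neurons than samples
        M = gamma * np.eye(h) + G.T @ Le @ G + lam * G.T @ L @ G
        return np.linalg.solve(M, G.T @ Le @ y)
    else:        # more hidden neurons than samples
        M = gamma * np.eye(N) + Le @ G @ G.T + lam * L @ G @ G.T
        return G.T @ np.linalg.solve(M, Le @ y)
```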
(3) Model prediction:
For the proposed model, the root mean square error (RMSE) is used as the evaluation index, with the calculation formula:

RMSE = sqrt( (1/k) Σ_(i=1..k) (y*_i - y_i)² )

where y*_i denotes the real data, y_i the output of the model and k the number of test samples. In general, a smaller RMSE means that the model's predictions are closer to the true values and the prediction performance is better.
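Putting the pieces together, a hypothetical end-to-end run on the target condition could look as follows; Xs, ys, Xt and the held-back target labels y_true are assumed to come from the CSTR simulation, and all sizes are illustrative.

```python
rng = np.random.default_rng(0)
X, y = integrate_and_normalize(Xs, ys, Xt)                 # step (1)
Zn = feature_nodes(X, n_groups=10, n_per_group=10, rng=rng)
G = np.hstack([Zn, enhancement_nodes(Zn, m=50, rng=rng)])  # step (2)
w = solve_output_weights(G, y, graph_laplacian(X), Ns=Xs.shape[0])
y_pred = G[Xs.shape[0]:] @ w                               # target-domain predictions
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))     # step (3) evaluation
```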
The advantages of the method of the invention are demonstrated by comparing its predictions with those of a regularized extreme learning machine (RELM) trained only on the labeled source-domain data; the prediction results are shown in FIG. 2 and FIG. 3.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.
Claims (2)
1. A method for constructing a stirred tank reactant concentration prediction model based on width transfer learning is characterized by comprising the following steps: the method comprises the following steps:
(1) Acquisition and integration of data
Two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program. The reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training. The source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples. The source-domain and target-domain data are integrated to obtain the input feature data of the width transfer learning network, which is used to predict the label data of the target-domain working condition;
(2) Model building
The model is based on a width learning network. First, the feature nodes of the width learning network extract features from the data; because the network parameters are random and the feature extraction is therefore biased, a sparse autoencoder is introduced to adjust the input weights. Features are then re-extracted by the enhancement layer, and the features jointly extracted by the feature nodes and enhancement nodes serve as the final hidden-layer features. Manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data, so that the model adapts well to prediction on unlabeled target-domain data;
(3) Model prediction
Comparing the label data of the target-domain working condition with the concentrations predicted by the model, and calculating the model's RMSE loss value as the evaluation standard of the model;
In the step (2), the model transfers between working conditions through the width transfer learning network, based on a manifold regularization framework, to predict the reactant concentration of the target working condition; the specific process is as follows:
2.1, the feature nodes of the width learning network extract features from the data; the width learning network has only one hidden layer, composed of feature nodes and enhancement nodes, and the feature extraction formula is:

Z_i = φ_i(X W_ei + β_ei)

Z^n = [Z_1, Z_2, ..., Z_n]

where Z_i denotes the features learned by the i-th feature group, Z^n the total features extracted by all feature groups, φ_i the activation function of the i-th feature group, and W_ei and β_ei the randomly generated weights and biases of the i-th feature group; because the extracted features are random, a sparse autoencoder is used to adjust the weights W_ei into final weights W'_ei, with which features are extracted from X again to obtain the new features Z_i;
2.2, the enhancement nodes perform secondary feature extraction with a random orthogonal matrix, the extraction formula being:

H_j = ξ_j(Z^n W_hj + β_hj)

where H_j denotes the features extracted by the j-th enhancement node, and ξ_j, W_hj and β_hj denote the activation function, randomly generated weights and biases of the enhancement layer, respectively;
2.3, the features jointly extracted by the feature nodes and enhancement nodes of the width learning network serve as the final hidden-layer features:

G = [Z^n | H^m]

where G denotes the final features extracted by the hidden layer, G ∈ R^(N×(nl+m)), N denotes the number of samples, n the number of nodes per feature group, l the number of feature groups, m the number of enhancement nodes, and H^m the total features extracted by the whole enhancement layer, H^m = [H_1, ..., H_m];

the loss function is expressed as:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖²

where w denotes the weights between the hidden layer and the output layer and γ the regularization coefficient of the model; Λ_e is a diagonal weighting matrix obtained by an empirical formula, with (Λ_e)_ii = 1 for i ≤ N_s (labeled source samples) and (Λ_e)_ii = 0 for i > N_s (unlabeled target samples), consistent with the zero-padded label vector y;
2.4, manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data; the manifold regularization optimization function is expressed as:

min (1/2) Σ_ij w_ij (ŷ_i - ŷ_j)² = tr(Y^T L Y), Y = Gw

where ŷ_i and ŷ_j denote the predicted values of samples x_i and x_j, w_ij the weight between the two samples, X the sample data and Y = Gw the output of the network; D is a diagonal matrix whose diagonal elements are D_ii = Σ_j w_ij;

L denotes the graph Laplacian matrix, which is calculated as:

L = D - A

where A = [w_ij] denotes the graph connection matrix composed of the weights w_ij, so that the final matrix L has entries L_ii = D_ii - w_ii and L_ij = -w_ij for i ≠ j;
the manifold regularization expression is:

M(X) = tr(w^T G^T L G w)

The manifold regularization framework is added to the objective function as part of the minimum loss function, and the minimum loss function of the whole network is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖² + λ tr(w^T G^T L G w)

where λ is the manifold regularization coefficient. When the number of hidden-layer neurons is smaller than the input data size, the output weight w is:

w = (γI + G^T Λ_e G + λ G^T L G)^(-1) G^T Λ_e y

Otherwise, when the number of hidden-layer neurons is larger than the input data size, the output weight w is:

w = G^T (γI + Λ_e G G^T + λ L G G^T)^(-1) Λ_e y.
2. The method for constructing a stirred tank reactant concentration prediction model based on width transfer learning according to claim 1, wherein the process of the step (1) is as follows:
1.1, data acquisition: two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program; the reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training; the source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples;
1.2, integrating the auxiliary-variable data of the source and target domains: let X_s be the source-domain data, y_s the source-domain output label (the reactant concentration) and X_t the target-domain data, and let X = [X_s, X_t], where X denotes the integrated source- and target-domain data; the integrated data are standardized to eliminate the scale differences between the source-domain and target-domain data, with the specific calculation formula:

X_new = (X - μ) / σ

where X_new denotes the input data after normalization, X the unprocessed raw data, μ the mean of the raw data and σ the standard deviation of the raw data;
1.3, integrating the source-domain labels and the target labels: the source-domain label y_s and an N_t × 1 zero matrix matching the target-domain size form the total label y output by the network:

y = [y_s; 0_(N_t×1)]

where N_s denotes the number of source-domain samples and N_t the number of target-domain samples.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110999453.XA | 2021-08-29 | 2021-08-29 | Method for constructing stirred tank reactant concentration prediction model based on width transfer learning |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN113570161A | 2021-10-29 |
| CN113570161B | 2024-05-24 |
Family ID: 78172988

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110999453.XA | Method for constructing stirred tank reactant concentration prediction model based on width transfer learning | 2021-08-29 | 2021-08-29 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN113570161B (en) |
Families Citing this family (1)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| CN117272244B * | 2023-11-21 | 2024-03-15 | China University of Petroleum (East China) (中国石油大学(华东)) | Soft measurement modeling method integrating feature extraction and self-adaptive composition |
Citations (7)

| Publication Number | Priority Date | Publication Date | Title |
|---|---|---|---|
| CN108920888A * | 2018-04-26 | 2018-11-30 | Continuous stirred tank reactor process identification method based on deep neural network |
| CN109060001A * | 2018-05-29 | 2018-12-21 | Multi-working-condition process soft sensor modeling method based on feature transfer learning |
| CN110849627A * | 2019-11-27 | 2020-02-28 | Width transfer learning network and rolling bearing fault diagnosis method based on same |
| CN111461355A * | 2020-03-20 | 2020-07-28 | Dioxin emission concentration transfer learning prediction method based on random forest |
| CN111610768A * | 2020-06-10 | 2020-09-01 | Intermittent process quality prediction method based on similarity multi-source domain transfer learning strategy |
| CN111914708A * | 2020-07-23 | 2020-11-10 | Electroencephalogram signal classification method using transfer semi-supervised width learning |
| CN112836432A * | 2021-02-07 | 2021-05-25 | Indoor suspended particulate matter concentration prediction method based on transfer learning |

Family Cites (1)

| Publication Number | Priority Date | Publication Date | Title |
|---|---|---|---|
| CN109710636B * | 2018-11-13 | 2022-10-21 | Unsupervised industrial system anomaly detection method based on deep transfer learning |
Non-Patent Citations (1)

| Title |
|---|
| Network intrusion detection based on deep transfer learning (基于深度迁移学习的网络入侵检测); Lu Mingxing, Du Guozhen, Ji Zexu; Application Research of Computers (计算机应用研究), No. 09, 2020-12-31 * |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |