CN113570161B - Method for constructing stirred tank reactant concentration prediction model based on width transfer learning
- Publication number
- CN113570161B (application CN202110999453.XA)
- Authority
- CN
- China
- Legal status: Active
Classifications
- G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/045: Neural networks; combinations of networks
- G06N3/047: Neural networks; probabilistic or stochastic networks
- G06N3/061: Physical realisation of neural networks using biological neurons
- G06N3/08: Neural networks; learning methods
Abstract
A method for constructing a stirred tank reactant concentration prediction model based on width transfer learning belongs to the technical field of reactant concentration prediction in continuous stirred tank reactors. It comprises the following steps: (1) data acquisition and integration; (2) model building; and (3) model prediction. The constructed model introduces a transfer learning method that needs no target-domain labels into the field of stirred tank reactant concentration prediction, and can effectively address the problem that, in the production process of a continuous stirred tank reactor, the reactant concentration cannot be measured often enough to evaluate product quality. By selecting the reactant concentration as the prediction label, the transfer learning method solves the domain adaptation problem between a labeled working condition and an unlabeled working condition and improves prediction performance.
Description
Technical Field
The invention belongs to the technical field of prediction of reactant concentration in a continuous stirred tank reactor, and particularly relates to a method for constructing a stirred tank reactant concentration prediction model based on width transfer learning.
Background
A continuous stirred tank reactor (CSTR) is a device for carrying out various physical and chemical reactions in chemical processes. In the production of the three major synthetic materials (plastics, chemical fibers and synthetic rubber), CSTRs account for about 90% of all reactors used for synthetic production. They are also widely used in the pharmaceutical, paint, fuel, pesticide and other industries. The reaction mechanism of a CSTR is complex, with strong time-varying behavior and nonlinearity. Diversified market demands require frequent changes of production operating conditions, product ratios and the like, which leads to multiple working conditions in the production process. In a multi-working-condition process, most working conditions yield only a small amount of labeled data, or even none, so an accurate soft sensor model for predicting the reactant concentration cannot be built directly for most working conditions.
Transfer learning can establish a soft sensor model between working conditions with different amounts of labeled data and predict key variables. Common unsupervised transfer algorithms include Transfer Component Analysis (TCA), Subspace Alignment (SA) and the Geodesic Flow Kernel (GFK) method; supervised transfer algorithms include the Domain Adaptation Extreme Learning Machine (DAELM), Joint Distribution Adaptation (JDA) and Balanced Distribution Adaptation (BDA).
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method for constructing a stirred tank reactant concentration prediction model based on a width transfer learning network, taking the width learning network as the framework and introducing manifold regularization to construct a graph Laplacian matrix.
The invention provides the following technical scheme:
A method for constructing a stirred tank reactant concentration prediction model based on width transfer learning comprises the following steps:
(1) Acquisition and integration of data
Two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program. The reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training. The source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples. The source-domain and target-domain data are integrated to obtain the input feature data of the width transfer learning network, which is used to predict the label data of the target-domain working condition;
(2) Model building
The model is based on a width learning network. First, the feature nodes of the width learning network extract features from the data; because the network parameters are random and the feature extraction is therefore biased, a sparse autoencoder is introduced to adjust the input weights. Features are then re-extracted by the enhancement layer, and the features jointly extracted by the feature nodes and enhancement nodes serve as the final hidden-layer features. Manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data, so that the model adapts well to prediction on unlabeled target-domain data;
(3) Model prediction
The label data of the target-domain working condition are compared with the concentrations predicted by the model, and the RMSE loss of the model is calculated as its evaluation standard.
The method for constructing the stirred tank reactant concentration prediction model based on the width transfer learning is characterized in that the process of the step (1) is as follows:
1.1, data acquisition: two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program; the reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training; the source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples;
1.2, integrating the auxiliary-variable data of the source and target domains: let X_s be the source-domain data, y_s the source-domain output label (the reactant concentration) and X_t the target-domain data, and let X = [X_s, X_t], where X denotes the integrated source- and target-domain data; the integrated data are standardized to eliminate the scale differences between the source-domain and target-domain data, with the specific calculation formula:

X_new = (X - μ) / σ

where X_new denotes the input data after normalization, X the unprocessed raw data, μ the mean of the raw data and σ the standard deviation of the raw data;
1.3, integrating the source-domain labels and the target labels: the source-domain label y_s and an N_t × 1 zero matrix matching the target-domain size form the total label y output by the network:

y = [y_s; 0_(N_t×1)]

where N_s denotes the number of source-domain samples and N_t the number of target-domain samples.
The method for constructing the stirred tank reactant concentration prediction model based on width transfer learning is characterized in that, in the step (2), the model transfers between working conditions through the width transfer learning network, based on a manifold regularization framework, to predict the reactant concentration of the target working condition; the specific process is as follows:
2.1, the feature nodes of the width learning network extract features from the data; the width learning network has only one hidden layer, composed of feature nodes and enhancement nodes, and the feature extraction formula is:

Z_i = φ_i(X W_ei + β_ei)

Z^n = [Z_1, Z_2, ..., Z_n]

where Z_i denotes the features learned by the i-th feature group, Z^n the total features extracted by all feature groups, φ_i the activation function of the i-th feature group, and W_ei and β_ei the randomly generated weights and biases of the i-th feature group; because the extracted features are random, a sparse autoencoder is used to adjust the weights W_ei into final weights W'_ei, with which features are extracted from X again to obtain the new features Z_i;
2.2, the enhancement nodes perform secondary feature extraction with a random orthogonal matrix, the extraction formula being:

H_j = ξ_j(Z^n W_hj + β_hj)

where H_j denotes the features extracted by the j-th enhancement node, and ξ_j, W_hj and β_hj denote the activation function, randomly generated weights and biases of the enhancement layer, respectively;
2.3, the features jointly extracted by the feature nodes and enhancement nodes of the width learning network serve as the final hidden-layer features:

G = [Z^n | H^m]

where G denotes the final features extracted by the hidden layer, G ∈ R^(N×(nl+m)), N denotes the number of samples, n the number of nodes per feature group, l the number of feature groups, m the number of enhancement nodes, and H^m the total features extracted by the whole enhancement layer, H^m = [H_1, ..., H_m];
the loss function is expressed as:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖²

where w denotes the weights between the hidden layer and the output layer and γ the regularization coefficient of the model; Λ_e is a diagonal weighting matrix obtained by an empirical formula, with (Λ_e)_ii = 1 for i ≤ N_s (labeled source samples) and (Λ_e)_ii = 0 for i > N_s (unlabeled target samples), consistent with the zero-padded label vector y;
2.4, manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data; the manifold regularization optimization function is expressed as:

min (1/2) Σ_ij w_ij (ŷ_i - ŷ_j)² = tr(Y^T L Y), Y = Gw

where ŷ_i and ŷ_j denote the predicted values of samples x_i and x_j, w_ij the weight between the two samples, X the sample data and Y = Gw the output of the network; D is a diagonal matrix whose diagonal elements are D_ii = Σ_j w_ij;
L denotes the graph Laplacian matrix, which is calculated as:

L = D - A

where A = [w_ij] denotes the graph connection matrix composed of the weights w_ij, so that the final matrix L has entries L_ii = D_ii - w_ii and L_ij = -w_ij for i ≠ j;
the manifold regularization expression is:

M(X) = tr(w^T G^T L G w)

The manifold regularization framework is added to the objective function as part of the minimum loss function, and the minimum loss function of the whole network is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖² + λ tr(w^T G^T L G w)

where λ is the manifold regularization coefficient.
When the number of hidden-layer neurons is smaller than the input data size, the output weight w is:

w = (γI + G^T Λ_e G + λ G^T L G)^(-1) G^T Λ_e y

Otherwise, when the number of hidden-layer neurons is larger than the input data size, the output weight w is:

w = G^T (γI + Λ_e G G^T + λ L G G^T)^(-1) Λ_e y.
By adopting the above technical solution, compared with the prior art, the invention has the following beneficial effects:
The constructed model introduces a transfer learning method that needs no target-domain labels into the field of stirred tank reactant concentration prediction, and can effectively address the problem that, in the production process of a continuous stirred tank reactor, the reactant concentration cannot be measured often enough to evaluate product quality. By selecting the reactant concentration as the prediction label, the transfer learning method solves the domain adaptation problem between a labeled working condition and an unlabeled working condition and improves prediction performance.
Drawings
FIG. 1 is a block diagram of the width learning network in an embodiment of the present invention;
FIG. 2 is a graph of the predicted results for conditions 1 through 2 in an embodiment of the present invention;
FIG. 3 is a graph of the predicted results for operating conditions 2 through 1 in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and examples of the present invention. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
On the contrary, the invention is intended to cover any alternatives, modifications, equivalents and variations included within the spirit and scope of the invention as defined by the appended claims. Further, in the following detailed description, certain specific details are set forth to provide a better understanding of the invention; those skilled in the art will fully understand the invention even without some of these details.
Referring to fig. 1-3, the method for constructing the stirred tank reactant concentration prediction model based on width transfer learning comprises the following steps:
(1) Acquisition and integration of data
Data X_s, composed of the auxiliary variables, is taken as the source-domain data, and y_s denotes the source-domain output label, i.e. the reactant concentration. Data X_t, likewise composed of auxiliary variables, is the target-domain data. Let X = [X_s, X_t], where X denotes the data after the two domains are integrated. Because the source-domain and target-domain data differ in scale, the data need to be standardized, with the specific calculation formula:

X_new = (X - μ) / σ

where X_new denotes the input data after normalization, X the unprocessed raw data, μ the mean of the raw data and σ the standard deviation of the raw data.
The total label y output by the network consists of the source-domain label y_s stacked on an N_t × 1 zero matrix matching the target-domain size:

y = [y_s; 0_(N_t×1)]

where N_s denotes the number of source-domain samples and N_t the number of target-domain samples.
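As a concrete illustration of step (1), the following is a minimal NumPy sketch of the data integration, assuming Xs (N_s × d), ys (N_s × 1) and Xt (N_t × d) already hold the simulated source- and target-condition data; all function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def integrate_and_normalize(Xs, ys, Xt):
    """Stack source and target data, standardize, and zero-pad the target labels."""
    X = np.vstack([Xs, Xt])                          # X = [Xs; Xt]
    X_new = (X - X.mean(axis=0)) / X.std(axis=0)     # X_new = (X - mu) / sigma
    y = np.vstack([ys, np.zeros((Xt.shape[0], 1))])  # y = [ys; 0_(Nt x 1)]
    return X_new, y
```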
(2) And (3) establishing a model:
The model is based on a width learning network, a random-vector functional-link network; width learning retains the feature-learning capability of deep learning while achieving the efficient training of random networks. The method selects the reactant concentration as the key variable and predicts the reactant concentration of the target working condition by transferring between working conditions through the width learning network under a manifold regularization framework.
The width learning network has only one hidden layer, composed of feature nodes and enhancement nodes:

Z_i = φ_i(X W_ei + β_ei)

Z^n = [Z_1, Z_2, ..., Z_n]

where Z_i denotes the features learned by the i-th feature group and Z^n the total features extracted by all feature groups. φ_i denotes the activation function of the i-th feature group, and W_ei and β_ei are the randomly generated weights and biases of the i-th feature group. Because the extracted features are random, a sparse autoencoder is used to adjust the weights W_ei into final weights W'_ei. For the i-th feature group, features are extracted from X again with the adjusted weights W'_ei to obtain the new features Z_i.
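A minimal sketch of the feature-node mapping follows; the patent's sparse autoencoder is abbreviated here to a least-squares weight adjustment (a stand-in for the sparse solver used in broad learning systems), so treat this as an assumption rather than the exact procedure.

```python
def feature_nodes(X, n_groups, n_per_group, rng):
    """Feature-node mapping Z^n = [Z_1, ..., Z_n] with adjusted input weights."""
    Z_groups = []
    for _ in range(n_groups):
        We = rng.standard_normal((X.shape[1], n_per_group))  # random W_ei
        be = rng.standard_normal((1, n_per_group))           # random beta_ei
        A = np.tanh(X @ We + be)                             # phi_i(X W_ei + beta_ei)
        # Stand-in for the sparse autoencoder: least-squares weights W'_ei
        # such that X @ W'_ei reconstructs the random features A.
        We_adj = np.linalg.lstsq(X, A, rcond=None)[0]
        Z_groups.append(X @ We_adj)                          # re-extracted Z_i
    return np.hstack(Z_groups)
```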
The enhancement nodes then perform secondary feature extraction with a random orthogonal matrix:

H_j = ξ_j(Z^n W_hj + β_hj)

where H_j denotes the features extracted by the j-th enhancement node, and ξ_j, W_hj and β_hj denote the activation function, randomly generated weights and biases of the j-th enhancement node, respectively.

H^m = [H_1, ..., H_m]
where H^m is the total feature extracted by the enhancement nodes and m denotes the number of enhancement nodes. The total feature extracted by the feature nodes and enhancement nodes is G:

G = [Z^n | H^m]

where G ∈ R^(N×(nl+m)), N denotes the number of samples, n the number of nodes per feature group, l the number of feature groups and m the number of enhancement nodes. With w denoting the weights between the hidden layer and the output layer, the loss function is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖²

where γ denotes the regularization coefficient of the model, and Λ_e is a diagonal weighting matrix obtained by an empirical formula, with (Λ_e)_ii = 1 for i ≤ N_s (labeled source samples) and (Λ_e)_ii = 0 for i > N_s (unlabeled target samples), consistent with the zero-padded label vector y.
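The enhancement layer and the hidden-layer concatenation described above can be sketched as follows, again under illustrative assumptions (tanh activation, and m no larger than the width of Z^n so that QR yields m orthonormal columns):

```python
def enhancement_nodes(Zn, m, rng):
    """Secondary feature extraction H^m via a random orthogonal weight matrix."""
    W = rng.standard_normal((Zn.shape[1], m))
    Wh, _ = np.linalg.qr(W)          # orthonormal columns, assumes m <= Zn.shape[1]
    bh = rng.standard_normal((1, m))
    return np.tanh(Zn @ Wh + bh)     # xi(Z^n W_h + beta_h)

# Final hidden-layer feature G = [Z^n | H^m]:
# G = np.hstack([Zn, enhancement_nodes(Zn, m, rng)])
```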
Manifold regularization can explore the manifold features in the data, revealing its latent information and improving the adaptability of the model. Here, manifold regularization is added to the loss function to improve the transfer effect by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data.

Manifold regularization captures the structural features of the data through a neighbor graph: for any point in the data, a neighbor graph is built from its k nearest neighbors and each edge is assigned a weight, with the weight w_ij between samples computed by a Gaussian kernel function. The manifold regularization optimization function can be expressed as:

min (1/2) Σ_ij w_ij (ŷ_i - ŷ_j)² = tr(Y^T L Y), Y = Gw

where ŷ_i and ŷ_j denote the predicted values of samples x_i and x_j, w_ij the weight between the two samples, X the sample data and Y = Gw the output of the network. D is a diagonal matrix whose diagonal elements are D_ii = Σ_j w_ij.
L denotes the graph Laplacian matrix, which is calculated as:

L = D - A

where A = [w_ij] denotes the graph connection matrix composed of the weights w_ij, so the final matrix L has entries L_ii = D_ii - w_ii and L_ij = -w_ij for i ≠ j.
The manifold regularization expression is:

M(X) = tr(w^T G^T L G w)
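A sketch of the neighbor-graph construction follows; k and the kernel width sigma_g are illustrative hyperparameters, not values specified in the patent.

```python
def graph_laplacian(X, k=10, sigma_g=1.0):
    """k-nearest-neighbor graph with Gaussian-kernel weights; returns L = D - A."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    N = X.shape[0]
    A = np.zeros((N, N))
    for i in range(N):
        nn = np.argsort(d2[i])[1:k + 1]                       # k neighbors, skip self
        A[i, nn] = np.exp(-d2[i, nn] / (2.0 * sigma_g ** 2))  # Gaussian weights w_ij
    A = np.maximum(A, A.T)                                    # symmetrize the graph
    D = np.diag(A.sum(axis=1))                                # D_ii = sum_j w_ij
    return D - A                                              # graph Laplacian
```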
The manifold regularization framework is added to the objective function as part of the minimum loss function, so the final minimum loss function of the whole network is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖² + λ tr(w^T G^T L G w)

where λ is the manifold regularization coefficient.
When the number of hidden-layer neurons is smaller than the input data size, the output weight w is:

w = (γI + G^T Λ_e G + λ G^T L G)^(-1) G^T Λ_e y

Otherwise, when the number of hidden-layer neurons is larger than the input data size, the output weight w is:

w = G^T (γI + Λ_e G G^T + λ L G G^T)^(-1) Λ_e y
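Both closed-form solutions can be implemented directly; the sketch below assumes the Λ_e construction described above (unit weight on the N_s labeled source rows), and gamma and lam are illustrative regularization values.

```python
def solve_output_weights(G, y, L, Ns, gamma=1e-3, lam=1e-2):
    """Closed-form output weights w for both hidden-size regimes."""
    N, h = G.shape
    Le = np.zeros((N, N))
    Le[np.arange(Ns), np.arange(Ns)] = 1.0   # Lambda_e: weight labeled source rows
    if h <= N:   # fewer hidden neurons than samples
        M = gamma * np.eye(h) + G.T @ Le @ G + lam * G.T @ L @ G
        return np.linalg.solve(M, G.T @ Le @ y)
    else:        # more hidden neurons than samples
        M = gamma * np.eye(N) + Le @ G @ G.T + lam * L @ G @ G.T
        return G.T @ np.linalg.solve(M, Le @ y)
```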
(3) Model prediction:
For the proposed model, the root mean square error (RMSE) is used as the evaluation index, with the calculation formula:

RMSE = sqrt( (1/k) Σ_(i=1..k) (y*_i - y_i)² )

where y*_i denotes the real data, y_i the output of the model and k the number of test samples. In general, a smaller RMSE means that the model's predictions are closer to the true values and the prediction performance is better.
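Putting the pieces together, a hypothetical end-to-end run on the target condition could look as follows; Xs, ys, Xt and the held-back target labels y_true are assumed to come from the CSTR simulation, and all sizes are illustrative.

```python
rng = np.random.default_rng(0)
X, y = integrate_and_normalize(Xs, ys, Xt)                 # step (1)
Zn = feature_nodes(X, n_groups=10, n_per_group=10, rng=rng)
G = np.hstack([Zn, enhancement_nodes(Zn, m=50, rng=rng)])  # step (2)
w = solve_output_weights(G, y, graph_laplacian(X), Ns=Xs.shape[0])
y_pred = G[Xs.shape[0]:] @ w                               # target-domain predictions
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))     # step (3) evaluation
```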
The advantages of the method of the invention are demonstrated by comparing its predictions with those of a regularized extreme learning machine (RELM) trained only on the labeled source-domain data; the prediction results are shown in FIG. 2 and FIG. 3.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.
Claims (2)
1. A method for constructing a stirred tank reactant concentration prediction model based on width transfer learning is characterized by comprising the following steps: the method comprises the following steps:
(1) Acquisition and integration of data
Two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program. The reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training. The source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples. The source-domain and target-domain data are integrated to obtain the input feature data of the width transfer learning network, which is used to predict the label data of the target-domain working condition;
(2) Model building
The model is based on a width learning network. First, the feature nodes of the width learning network extract features from the data; because the network parameters are random and the feature extraction is therefore biased, a sparse autoencoder is introduced to adjust the input weights. Features are then re-extracted by the enhancement layer, and the features jointly extracted by the feature nodes and enhancement nodes serve as the final hidden-layer features. Manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data, so that the model adapts well to prediction on unlabeled target-domain data;
(3) Model prediction
Comparing the label data of the target-domain working condition with the concentrations predicted by the model, and calculating the model's RMSE loss value as the evaluation standard of the model;
In the step (2), the model transfers between working conditions through the width transfer learning network, based on a manifold regularization framework, to predict the reactant concentration of the target working condition; the specific process is as follows:
2.1, the feature nodes of the width learning network extract features from the data; the width learning network has only one hidden layer, composed of feature nodes and enhancement nodes, and the feature extraction formula is:

Z_i = φ_i(X W_ei + β_ei)

Z^n = [Z_1, Z_2, ..., Z_n]

where Z_i denotes the features learned by the i-th feature group, Z^n the total features extracted by all feature groups, φ_i the activation function of the i-th feature group, and W_ei and β_ei the randomly generated weights and biases of the i-th feature group; because the extracted features are random, a sparse autoencoder is used to adjust the weights W_ei into final weights W'_ei, with which features are extracted from X again to obtain the new features Z_i;
2.2, the enhancement nodes perform secondary feature extraction with a random orthogonal matrix, the extraction formula being:

H_j = ξ_j(Z^n W_hj + β_hj)

where H_j denotes the features extracted by the j-th enhancement node, and ξ_j, W_hj and β_hj denote the activation function, randomly generated weights and biases of the enhancement layer, respectively;
2.3, the features jointly extracted by the feature nodes and enhancement nodes of the width learning network serve as the final hidden-layer features:

G = [Z^n | H^m]

where G denotes the final features extracted by the hidden layer, G ∈ R^(N×(nl+m)), N denotes the number of samples, n the number of nodes per feature group, l the number of feature groups, m the number of enhancement nodes, and H^m the total features extracted by the whole enhancement layer, H^m = [H_1, ..., H_m];

the loss function is expressed as:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖²

where w denotes the weights between the hidden layer and the output layer and γ the regularization coefficient of the model; Λ_e is a diagonal weighting matrix obtained by an empirical formula, with (Λ_e)_ii = 1 for i ≤ N_s (labeled source samples) and (Λ_e)_ii = 0 for i > N_s (unlabeled target samples), consistent with the zero-padded label vector y;
2.4, manifold regularization is added to the loss function, and the transfer effect is improved by learning the manifold features between the labeled source-domain data and the unlabeled target-domain data; the manifold regularization optimization function is expressed as:

min (1/2) Σ_ij w_ij (ŷ_i - ŷ_j)² = tr(Y^T L Y), Y = Gw

where ŷ_i and ŷ_j denote the predicted values of samples x_i and x_j, w_ij the weight between the two samples, X the sample data and Y = Gw the output of the network; D is a diagonal matrix whose diagonal elements are D_ii = Σ_j w_ij;

L denotes the graph Laplacian matrix, which is calculated as:

L = D - A

where A = [w_ij] denotes the graph connection matrix composed of the weights w_ij, so that the final matrix L has entries L_ii = D_ii - w_ii and L_ij = -w_ij for i ≠ j;
the manifold regularization expression is:

M(X) = tr(w^T G^T L G w)

The manifold regularization framework is added to the objective function as part of the minimum loss function, and the minimum loss function of the whole network is:

min_w (y - Gw)^T Λ_e (y - Gw) + γ‖w‖² + λ tr(w^T G^T L G w)

where λ is the manifold regularization coefficient. When the number of hidden-layer neurons is smaller than the input data size, the output weight w is:

w = (γI + G^T Λ_e G + λ G^T L G)^(-1) G^T Λ_e y

Otherwise, when the number of hidden-layer neurons is larger than the input data size, the output weight w is:

w = G^T (γI + Λ_e G G^T + λ L G G^T)^(-1) Λ_e y.
2. The method for constructing a stirred tank reactant concentration prediction model based on width transfer learning according to claim 1, wherein the process of the step (1) is as follows:
1.1, data acquisition: two data sets with different working-condition characteristics are simulated with a continuous stirred tank reactor simulation program; the reactant concentration is taken as the prediction label, the reactor temperature T and the cooling flow q_c are selected as auxiliary variables, and the two working conditions serve as the source domain and the target domain for training; the source domain takes the auxiliary variables and labels of its working condition as data samples; the target domain takes only the auxiliary variables of its working condition as data samples;
1.2, integrating the auxiliary-variable data of the source and target domains: let X_s be the source-domain data, y_s the source-domain output label (the reactant concentration) and X_t the target-domain data, and let X = [X_s, X_t], where X denotes the integrated source- and target-domain data; the integrated data are standardized to eliminate the scale differences between the source-domain and target-domain data, with the specific calculation formula:

X_new = (X - μ) / σ

where X_new denotes the input data after normalization, X the unprocessed raw data, μ the mean of the raw data and σ the standard deviation of the raw data;
1.3, integrating the source-domain labels and the target labels: the source-domain label y_s and an N_t × 1 zero matrix matching the target-domain size form the total label y output by the network:

y = [y_s; 0_(N_t×1)]

where N_s denotes the number of source-domain samples and N_t the number of target-domain samples.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110999453.XA | 2021-08-29 | 2021-08-29 | Method for constructing stirred tank reactant concentration prediction model based on width transfer learning |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN113570161A | 2021-10-29 |
| CN113570161B | 2024-05-24 |
Family ID: 78172988

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110999453.XA | Method for constructing stirred tank reactant concentration prediction model based on width transfer learning | 2021-08-29 | 2021-08-29 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN113570161B (en) |
Families Citing this family (1)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| CN117272244B * | 2023-11-21 | 2024-03-15 | China University of Petroleum (East China) (中国石油大学(华东)) | Soft measurement modeling method integrating feature extraction and self-adaptive composition |
Citations (7)

| Publication Number | Priority Date | Publication Date | Title |
|---|---|---|---|
| CN108920888A * | 2018-04-26 | 2018-11-30 | Continuous stirred tank reactor process identification method based on deep neural network |
| CN109060001A * | 2018-05-29 | 2018-12-21 | Multi-working-condition process soft sensor modeling method based on feature transfer learning |
| CN110849627A * | 2019-11-27 | 2020-02-28 | Width transfer learning network and rolling bearing fault diagnosis method based on same |
| CN111461355A * | 2020-03-20 | 2020-07-28 | Dioxin emission concentration transfer learning prediction method based on random forest |
| CN111610768A * | 2020-06-10 | 2020-09-01 | Intermittent process quality prediction method based on similarity multi-source domain transfer learning strategy |
| CN111914708A * | 2020-07-23 | 2020-11-10 | Electroencephalogram signal classification method using transfer semi-supervised width learning |
| CN112836432A * | 2021-02-07 | 2021-05-25 | Indoor suspended particulate matter concentration prediction method based on transfer learning |

Family Cites (1)

| Publication Number | Priority Date | Publication Date | Title |
|---|---|---|---|
| CN109710636B * | 2018-11-13 | 2022-10-21 | Unsupervised industrial system anomaly detection method based on deep transfer learning |
Non-Patent Citations (1)

| Title |
|---|
| Network intrusion detection based on deep transfer learning (基于深度迁移学习的网络入侵检测); Lu Mingxing, Du Guozhen, Ji Zexu; Application Research of Computers (计算机应用研究), No. 09, 2020-12-31 * |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |