CN109002686B - Multi-grade chemical process soft measurement modeling method capable of automatically generating samples

Multi-grade chemical process soft measurement modeling method capable of automatically generating samples

Info

Publication number: CN109002686B (application CN201810692852.XA)
Authority: CN (China)
Prior art keywords: data, generator, function, discriminator, gradient
Legal status: Active (granted)
Original language: Chinese (zh); earlier publication: CN109002686A
Inventors: 刘毅, 陈波成, 徐东伟, 陈壮志, 宣琦
Applicant and current assignee: Zhejiang University of Technology (ZJUT)


Abstract

A multi-grade chemical process soft measurement modeling method capable of automatically generating samples comprises the following steps: (1) dividing the data acquired from a multi-grade chemical process, as the original data, into a training set and a test set; (2) establishing a generative adversarial network (AGAN) based on the gradient penalty and the Wasserstein distance, inputting the training set into the network, and training it; (3) generating virtual samples with the trained AGAN and combining them with the original training set to form a new training set; (4) using the new training set as the driving data for soft measurement modeling, adjusting the parameters of the soft measurement model to fit the new training set, and predicting the key quality variables of the multi-grade chemical process with the trained soft measurement model. The method generates data with the adversarial network to compensate for the shortage of data and to improve the prediction accuracy of the data-driven soft measurement model.

Description

Multi-grade chemical process soft measurement modeling method capable of automatically generating samples
Technical Field
The invention relates to the field of soft measurement modeling for chemical processes, and in particular to a method for multi-grade chemical process soft measurement modeling based on generative adversarial networks.
Background
In chemical processes, estimating key quality variables plays an important role in keeping production equipment running continuously and smoothly, guaranteeing product quality, and making full use of the production capacity of the equipment. However, some important quality variables, such as catalyst activity, the melt index of a polymer, certain quality indexes of petroleum and petrochemical products, and the biomass and product concentrations of a fermentation process, are difficult to measure directly with online sensors, so soft measurement modeling is required to predict them.
When the chemical process technology and its mechanism are deeply understood, mechanistic modeling can achieve good results. In practice, however, many complex chemical processes are hard to model accurately because of their inherent complexity and the lack of relevant expert knowledge. In addition, mechanism analysis is usually built on the assumption of ideal steady-state operation, and for chemical processes that cannot operate continuously at a stable point it is difficult to characterize the whole process. For these problems, building a data-driven soft measurement prediction model is feasible. Useful modeling information is extracted from historical input and output data, so the dominant and auxiliary variables of the model can be analyzed, and the mathematical relationship between the variables constructed, without much process knowledge; this is a general and effective system modeling approach. Its advantages are that modeling is simple, a deep understanding of the reaction mechanism is not required, and the accuracy is high.
Many data-driven models, such as artificial neural networks (ANN), partial least squares (PLS), support vector regression (SVR), and deep belief networks (DBN), have been applied in the chemical process field as soft measurement methods, with some success. However, two problems remain: first, a sufficient amount of data is required to guarantee the accuracy of the prediction results; second, chemical process data are costly to acquire and the acquisition cycle is long.
Disclosure of Invention
In order to overcome the high data acquisition cost and low prediction accuracy of conventional multi-grade chemical process soft measurement methods, the invention provides a multi-grade chemical process soft measurement modeling method that automatically generates samples. It generates data with a generative adversarial network based on the gradient penalty and the Wasserstein distance (AGAN for short) to compensate for the shortage of data and to improve the prediction accuracy of the data-driven soft measurement model.
The technical scheme adopted by the invention for solving the technical problems is as follows:
A multi-grade chemical process soft measurement modeling method that automatically generates samples comprises the following steps:
(1) Partitioning the multi-grade chemical process data set
Dividing the data acquired from the multi-grade chemical process, as the original data, into a training set and a test set according to a set proportion, to facilitate cross validation;
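As a minimal sketch of this partition (using NumPy; the 72-sample size, 5 variables, and 50/22 split are taken from the worked example later in the text, while the shuffling seed is an arbitrary assumption):

```python
import numpy as np

def split_dataset(data, train_ratio, seed=0):
    """Shuffle the samples and split them by the given proportion."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_train = int(round(len(data) * train_ratio))
    return data[idx[:n_train]], data[idx[n_train:]]

# Toy data set: 72 samples of 5 process variables, split 50/22 like the
# first grade in the embodiment below.
data = np.arange(72 * 5, dtype=float).reshape(72, 5)
train, test = split_dataset(data, train_ratio=50 / 72)
```

Shuffling before splitting keeps both sets representative of all operating conditions, which matters for the cross validation mentioned above.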
(2) AGAN model principle and training process
Establishing a generative adversarial network (AGAN) based on the gradient penalty and the Wasserstein distance, inputting the divided training set into the network, and training it;
(3) Constructing a new training set
Generating virtual samples with the trained AGAN and combining them with the original training set to form a new training set;
(4) Adjusting the soft measurement model parameters to the new training set
Using the new training set as the driving data for soft measurement modeling, adjusting the parameters of the soft measurement model to fit the new training set, and predicting the key quality variables of the multi-grade chemical process with the trained soft measurement model.
Further, the process of step (2) is as follows:
Step 2.1: establishing a generative adversarial network based on the gradient penalty and the Wasserstein distance;
the Wasserstein distance is used to measure the distance between two distributions, and is calculated as follows:
Figure BDA0001712947860000031
wherein: prDistribution of real data; pgTo generate a distribution of data; II (P)r,Pg) Is PrAnd PgA set of all possible joint distributions combined; (x, y) -gamma are expressed as a real data x and a generated data y which are obtained by sampling gamma; the | x-y | is the distance between the real data and the generated data;
Figure BDA0001712947860000032
as desired; inf represents a takedown bound; the whole W function is expressed as a lower bound taken for an expected value in all the joint distributions, and the expected value is defined as a Wasserstein distance;
converting the above calculation formula into the following formula:
Figure BDA0001712947860000033
wherein: x obeys PrThe distribution of (a);
Figure BDA0001712947860000034
compliance PgThe distribution of (a); f (x) is a function comprising x,
Figure BDA0001712947860000035
to comprise
Figure BDA0001712947860000036
A function of (a); k is the Lipschitz constant of the function f, expressed as a constraint on a continuous function f such that there is a constant K ≧ 0 for any x in the defined domain1And x2All satisfy | f (x)1)-f(x2)|≤K|x1-x2|;||f||LK is less than or equal to K, and the Lipschitz constant of the function f does not exceed K; sup represents taking the upper bound; the whole W function represents 1/K times of the Lipschitz constant of the function f without exceeding K
Figure BDA0001712947860000037
The upper bound of (c);
Figure BDA0001712947860000038
wherein: f. ofw(x) Is a function containing x;
Figure BDA0001712947860000039
to comprise
Figure BDA00017129478600000310
A function of (a); f. ofwA series of functions with parameters w, which will be constructed by the network in the generation countermeasure network;
according to the theoretical derivation of Wasserstein distance, the loss functions of the discriminator D and the generator G in the generation countermeasure network are as follows:
Figure BDA0001712947860000041
Figure BDA0001712947860000042
wherein: s (D) is a discriminator loss function; s (G) is a generator loss function; prDistribution of real data; pgTo generate a distribution of data; x obeys PrThe distribution of (a);
Figure BDA0001712947860000043
compliance PgThe distribution of (a); d (x) is a function comprising x;
Figure BDA0001712947860000044
to comprise
Figure BDA0001712947860000045
A function of (a);
the loss function of the improved discriminator is as follows:
Figure BDA0001712947860000046
Figure BDA0001712947860000047
wherein:
Figure BDA0001712947860000048
is at the same time
Figure BDA0001712947860000049
And the value of the random interpolation sample on the connecting line of x;
Figure BDA00017129478600000410
is composed of
Figure BDA00017129478600000411
The distribution of (a); ε is [0,1]A random number in between; λ is the coefficient of the gradient penalty;
Figure BDA00017129478600000412
is composed of
Figure BDA00017129478600000413
A 2-norm of the gradient of the function; the first part is the expectation of the probability that the generator generates data to be judged as real data, the second part is the expectation of the probability that the real data is judged as the real data, and the third part is the gradient punishment;
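As an illustrative aside (not part of the patent): for one-dimensional distributions with equal sample counts, the empirical Wasserstein-1 distance reduces to the mean absolute difference between the sorted samples, which gives a quick numeric feel for the distance defined above:

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal coupling pairs the sorted samples, so the
    distance is the mean absolute difference after sorting.
    """
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

x = np.array([0.0, 1.0, 2.0, 3.0])
d_same = wasserstein_1d(x, x)         # identical samples: distance 0
d_shift = wasserstein_1d(x, x + 2.0)  # shifting every sample by 2 gives 2
```

Unlike divergences that saturate when the two supports do not overlap, this distance keeps growing smoothly with the shift, which is why it provides a usable training gradient.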
Step 2.2: network structure of the generative adversarial network based on the gradient penalty and the Wasserstein distance
The AGAN consists of two networks, a generator and a discriminator. The generator is a multi-layer perceptron whose input is standard Gaussian noise; the number of layers is determined by the application object. The first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer, whose dimensionality is the same as that of the data; the activation function of the input and hidden layers is the rectified linear unit (ReLU). The discriminator is likewise a multi-layer perceptron whose input is original training data or generated data; the number of layers is determined by the specific application object. The first layer is the input layer, the middle layers are hidden layers, the last layer is the output layer, and the activation function of the input and hidden layers is the rectified linear unit;
the Sigmoid activation function activates well near 0, but its gradient in the positive and negative saturation regions is close to 0, which can cause vanishing gradients, whereas the gradient of the ReLU activation function is constant over the part greater than 0;
Step 2.3: the training process of the generative adversarial network based on the gradient penalty and the Wasserstein distance is as follows
The generator captures the sample data distribution, and the discriminator estimates the probability that a sample comes from the real data rather than the generated data. The input of the generator is a set of Gaussian noise, which provides the generator with a preset data distribution; the generator converts this into a set of virtual data, i.e. the generated data. The input of the discriminator is real data and generated data, and its output is the probability that the input sample is real data. According to the generator loss function, maximizing the discriminator's output probability for generated data reduces the generator's loss, i.e. improves the generator, so that the virtual data come closer to the distribution of the real data. According to the discriminator loss function, minimizing the discriminator's output probability for generated data improves the discriminator, so that it distinguishes real data from virtual data more accurately. This is the adversarial relationship between the discriminator and the generator in the adversarial network; when the discriminator and the generator reach equilibrium, the training process ends;
Step 2.4: the parameters of the generative adversarial network based on the gradient penalty and the Wasserstein distance are updated as follows:
initialize the relevant parameters: gradient penalty coefficient λ = 10; discriminator parameters w; generator parameters θ; number of discriminator updates per generator update n_critic = 5; adaptive moment estimation parameters α = 0.0001, β_1 = 0, β_2 = 0.9; batch size m;
2.4.1) sample x from the real data distribution P_r; sample z from the latent space P(z) defined by the Gaussian noise; sample ε as a random number in [0, 1];
2.4.2) the discriminator loss is computed as follows:

\tilde{x}^{(i)} = G_\theta(z^{(i)})

\hat{x}^{(i)} = \epsilon x^{(i)} + (1 - \epsilon)\tilde{x}^{(i)}

S(D)^{(i)} = D_w(\tilde{x}^{(i)}) - D_w(x^{(i)}) + \lambda\,(\|\nabla_{\hat{x}} D_w(\hat{x}^{(i)})\|_2 - 1)^2

wherein: G_θ(z) is the generator output for noise z, i.e. the data generated by passing z through the generator; S(D)^{(i)} is the loss computed in the discriminator for the i-th sample; D_w(x^{(i)}) and D_w(\tilde{x}^{(i)}) are the discriminator outputs for the real and generated samples; \|\nabla_{\hat{x}} D_w(\hat{x}^{(i)})\|_2 is the 2-norm of the gradient of the function D_w at the interpolated sample;
2.4.3) the discriminator parameters are optimized with the adaptive moment estimation algorithm (Adam for short); the gradient is computed as

w \leftarrow \mathrm{Adam}\Big( \nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)},\ w,\ \alpha,\ \beta_1,\ \beta_2 \Big)

i.e. the gradient of the mean loss over a batch, implemented by batch gradient descent; the parameters w are updated with m samples each time;
2.4.4) steps 2.4.1) to 2.4.3) are repeated n_critic times;
2.4.5) m noise samples \{z^{(i)}\}_{i=1}^{m} are drawn from P(z), and the generator parameters are optimized by the Adam gradient method; the gradient is computed as

\theta \leftarrow \mathrm{Adam}\Big( \nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w(G_\theta(z^{(i)})),\ \theta,\ \alpha,\ \beta_1,\ \beta_2 \Big)

i.e. the gradient of the mean generator loss over a batch, implemented by batch gradient descent; the parameters θ are updated with m samples each time;
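A toy sketch (not the patent's network: a hand-written linear critic D_w(x) = w·x is assumed so that its gradient is known in closed form) of the interpolation step and the gradient-penalty term of step 2.4.2):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 10.0                                   # gradient penalty coefficient lambda = 10
w = np.array([0.6, 0.8])                     # toy linear critic D_w(x) = w . x

def critic(x):
    return x @ w                             # gradient w.r.t. x is w, in closed form

x_real = rng.normal(size=(5, 2))             # batch sampled from P_r
x_fake = rng.normal(size=(5, 2)) + 3.0       # batch standing in for P_g (shifted)
eps = rng.uniform(size=(5, 1))               # epsilon ~ U[0, 1], one per sample
x_hat = eps * x_real + (1.0 - eps) * x_fake  # random interpolation samples

# For a linear critic the gradient at every x_hat equals w, so the penalty
# reduces to lam * (||w||_2 - 1)^2 for each interpolated sample.
grad_norm = np.linalg.norm(w)
penalty = lam * (grad_norm - 1.0) ** 2
loss_D = critic(x_fake).mean() - critic(x_real).mean() + penalty
```

In a real AGAN the gradient at x̂ would come from automatic differentiation of the discriminator network rather than this closed form; the sketch only shows the mechanics of the three-term loss.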
the method actively generates modeling data by using the generated countermeasure network, and can generate more data reflecting process characteristics under the condition of less original training data. The method comprises the steps of training a generation countermeasure network based on Wasserstein distance and gradient punishment by using sample data acquired in a multi-brand chemical process, automatically increasing the sample data volume by using a generator in a generative model after the training is finished, combining original training data and generated data into new training data, and inputting the new training data into a traditional soft measurement model so as to improve the accuracy of the model for predicting the key quality quantity of the multi-brand chemical process.
The invention has the following beneficial effects: the method uses the generation countermeasure network to generate data, automatically increases the sample data amount, is simple and convenient, avoids high-cost and long-period data acquisition in the multi-brand chemical process soft measurement modeling, and effectively improves the prediction accuracy of the soft measurement model.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2a is a graph comparing the prediction error values of the first-grade samples under the MLP and AGAN + MLP models;
FIG. 2b is a graph comparing the prediction error values of the second-grade samples under the MLP and AGAN + MLP models;
FIG. 2c is a graph comparing the prediction error values of the first-grade samples under the PLS and AGAN + PLS models;
FIG. 2d is a graph comparing the prediction error values of the second-grade samples under the PLS and AGAN + PLS models;
in FIG. 2a, the training sets of the MLP and AGAN + MLP models are 50 original training samples and 50 original training samples plus 400 generated samples, respectively, with a test set of 22 samples; in FIG. 2b, the training sets of the MLP and AGAN + MLP models are 170 original training samples and 170 original training samples plus 100 generated samples, respectively, with a test set of 41 samples; in FIG. 2c, the training sets of the PLS and AGAN + PLS models are 50 original training samples and 50 original training samples plus 100 generated samples, respectively, with a test set of 22 samples; in FIG. 2d, the training sets of the PLS and AGAN + PLS models are 170 original training samples and 170 original training samples plus 200 generated samples, respectively, with a test set of 41 samples.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 2d, a multi-grade chemical process soft measurement modeling method for automatically generating samples includes the following steps:
(1) Partitioning the multi-grade chemical process data set
Each element of the multi-grade chemical process data set is a high-dimensional vector, in which one dimension is the variable to be predicted and the other dimensions are known variables. The data collected from the multi-grade chemical process are used as the original data and divided into a training set and a test set according to a set proportion, to facilitate cross validation.
(2) The principle and training of the AGAN model comprise the following steps:
Step 2.1: principle of the generative adversarial network based on the gradient penalty and the Wasserstein distance
The Wasserstein distance measures the distance between two distributions, and its calculation formula is as follows:

W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}[\, \|x - y\| \,]

wherein: P_r is the distribution of the real data; P_g is the distribution of the generated data; \Pi(P_r, P_g) is the set of all possible joint distributions whose marginals are P_r and P_g; (x, y) \sim \gamma denotes a real sample x and a generated sample y drawn from \gamma; \|x - y\| is the distance between the real sample and the generated sample; \mathbb{E} denotes the expectation; \inf denotes the infimum (greatest lower bound). The whole W function is the infimum, over all the joint distributions, of this expected value, and that infimum is defined as the Wasserstein distance.
Since the formula above cannot be solved directly, it is converted into the following formula:

K \cdot W(P_r, P_g) = \sup_{\|f\|_L \le K} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})]

wherein: x obeys the distribution P_r and \tilde{x} obeys the distribution P_g; f(x) and f(\tilde{x}) are the values of a continuous function f at x and \tilde{x}; K is the Lipschitz constant of f, expressed as the constraint on a continuous function f that there exists a constant K \ge 0 such that any x_1 and x_2 in the domain satisfy |f(x_1) - f(x_2)| \le K\,|x_1 - x_2|; \|f\|_L \le K means the Lipschitz constant of f does not exceed K; \sup denotes the supremum (least upper bound). The whole W function is 1/K times the supremum of \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})] over all functions f whose Lipschitz constant does not exceed K.
Parameterizing f with a network gives

W(P_r, P_g) \approx \max_{w} \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f_w(\tilde{x})]

wherein: f_w(x) and f_w(\tilde{x}) are the values of f_w at x and \tilde{x}; f_w is a family of functions with parameters w, which is constructed by the discriminator network of the generative adversarial network.
According to this derivation of the Wasserstein distance, the loss functions of the discriminator and the generator (G for short) of the generative adversarial network are as follows:

S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)]

S(G) = -\mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})]

wherein: S(D) is the discriminator loss function; S(G) is the generator loss function; P_r is the distribution of the real data; P_g is the distribution of the generated data; x obeys P_r and \tilde{x} obeys P_g; D(x) and D(\tilde{x}) are the discriminator outputs for x and \tilde{x}.
The AGAN adds a gradient penalty (GP for short) measure on top of this theory to solve the vanishing or exploding gradient problems it suffers from. The loss function of the improved discriminator is as follows:

S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\big[ (\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2 \big]

\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}

wherein: \hat{x} is the value of a random interpolation sample on the line connecting \tilde{x} and x; P_{\hat{x}} is the distribution of \hat{x}; \epsilon is a random number in [0, 1]; \lambda is the gradient penalty coefficient, generally taken as 10; \|\nabla_{\hat{x}} D(\hat{x})\|_2 is the 2-norm of the gradient of the function D at \hat{x}. The first term is the expectation of the discriminator output for the generated data, the second term is the expectation of the discriminator output for the real data, and the third term is the gradient penalty.
Step 2.2: network structure of the generative adversarial network based on the gradient penalty and the Wasserstein distance
The AGAN consists of two networks, a generator and a discriminator. The generator is a multi-layer perceptron whose input is standard Gaussian noise; the number of layers is determined by the specific application object. The first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer, whose dimensionality is the same as that of the data; the activation function of the input and hidden layers is the rectified linear unit (ReLU for short). The discriminator is likewise a multi-layer perceptron whose input is original training data or generated data; the number of layers is determined by the specific application object. The first layer is the input layer, the middle layers are hidden layers, the last layer is the output layer, and the activation function of the input and hidden layers is the rectified linear unit.
The Sigmoid activation function activates well near 0, but its gradient in the positive and negative saturation regions is close to 0, which can cause vanishing gradients; the gradient of the ReLU activation function is constant over the part greater than 0, so the vanishing gradient phenomenon is avoided. The ReLU activation function also has the advantages of sparsity and faster training.
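The saturation contrast described above can be checked numerically; this NumPy sketch evaluates the two gradients directly:

```python
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)                       # tends to 0 in the saturated regions

def relu_grad(x):
    return (np.asarray(x) > 0).astype(float)   # constant 1 for x > 0

# Near 0 the sigmoid is well activated (gradient 0.25 at x = 0), but at
# x = 10 its gradient has all but vanished, while ReLU's gradient is still 1.
g_sig_0 = sigmoid_grad(0.0)    # 0.25
g_sig_10 = sigmoid_grad(10.0)  # ~4.5e-05
g_relu_10 = relu_grad(10.0)    # 1.0
```

Stacked layers multiply such gradients together, which is why a near-zero sigmoid gradient in one layer starves the layers below it, while ReLU passes the gradient through unchanged.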
Step 2.3: the training process of the generative adversarial network based on the gradient penalty and the Wasserstein distance is as follows
The generator captures the sample data distribution, and the discriminator estimates the probability that a sample comes from the real data rather than the generated data. The input of the generator is a set of Gaussian noise, which provides the generator with a preset data distribution; the generator converts this into a set of virtual data, i.e. the generated data. The input of the discriminator is real data and generated data, and its output is the probability that the input sample is real data. According to the generator loss function, maximizing the discriminator's output probability for generated data reduces the generator's loss, i.e. improves the generator, so that the virtual data come closer to the distribution of the real data. According to the discriminator loss function, minimizing the discriminator's output probability for generated data improves the discriminator, so that it distinguishes real data from virtual data more accurately. This is the adversarial relationship between the discriminator and the generator in the adversarial network. When the discriminator and the generator reach equilibrium, the training process ends.
Step 2.4: a method for generating countermeasure network parameter updates based on gradient penalties and Wasserstein distances, the process is as follows:
relevant parameters are initialized. Gradient penalty coefficient: λ ═ 10; w is a discriminator parameter; theta is a generator parameter; number of discriminants, n, trained per training generatorcritic(ii) 5; parameters of the adaptive moment estimation: α ═ 0.0001, β1=0,β20.9; and m is the number of samples.
2.4.1) x distribution P from the real datarMiddle sampling, z obeys implicit space P (z) defined by the Gaussian distributed noise, and epsilon is [0-1 [ ]]A random number in between.
2.4.2) formula for calculating the discriminator loss is as follows:
Figure BDA0001712947860000111
Figure BDA0001712947860000112
Figure BDA0001712947860000113
wherein: gθ(z) passing z through a generator for data generation for a function containing z; s (D)(i)The calculated loss in the arbiter for the ith data;
Figure BDA0001712947860000114
to comprise
Figure BDA0001712947860000115
A function of (a); dw(x) Is a bagA function containing x;
Figure BDA0001712947860000116
is composed of
Figure BDA0001712947860000117
2 norm of gradient of function.
2.4.3) optimizing parameters of the discriminator according to an adaptive moment estimation algorithm (Adam for short), wherein a gradient calculation formula is as follows:
Figure BDA0001712947860000118
wherein:
Figure BDA0001712947860000119
is composed of
Figure BDA00017129478600001110
The gradient of (2) is updated by a batch gradient descent method, and the w parameter is updated by using m samples each time.
2.4.4) repeating steps 1-4ncriticNext, the process is carried out.
2.4.5) sampling m samples from P (z)
Figure BDA00017129478600001111
According to the Adam gradient descent method, parameters of a generator are optimized, and a gradient calculation formula is as follows:
Figure BDA00017129478600001112
wherein:
Figure BDA00017129478600001113
is composed of
Figure BDA00017129478600001114
The gradient of (2) is obtained by a batch gradient descent method, and the theta parameter is updated by using m samples each time.
The adaptive moment estimation algorithm is computationally efficient, converges quickly, and occupies little memory; it is suitable for cases with large amounts of data or many parameters, its hyperparameters have intuitive interpretations, and they usually need no tuning.
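A minimal NumPy sketch of a single Adam update with the hyperparameters stated above (α = 0.0001, β_1 = 0, β_2 = 0.9); the step index t and the toy parameter and gradient values are illustrative assumptions:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-4, beta1=0.0, beta2=0.9, eps=1e-8):
    """One Adam parameter update (hyperparameters as given in the text)."""
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1.0 - beta2 ** t)
    return theta - alpha * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy update: one parameter, one gradient value, first step t = 1.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
theta, m, v = adam_step(theta, np.array([4.0]), m, v, t=1)
```

With β_1 = 0 the first moment is just the current gradient, so at t = 1 the step is essentially α times the sign of the gradient; the per-parameter scaling by √v̂ is what makes the method insensitive to the raw gradient magnitude.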
(3) Constructing a new training set
The generator in the AGAN is extracted on its own to generate virtual samples; the generated virtual samples have the same dimensionality as the original training data and can be added to the original training set as new training data.
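A sketch of the extracted generator as a forward-only multi-layer perceptron (the layer sizes, weights, and the figure of 400 generated samples are hypothetical, since the patent leaves the layer count and widths to the specific application object):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical layer sizes for illustration only.
noise_dim, hidden_dim, sample_dim = 4, 16, 5
W1 = rng.normal(scale=0.1, size=(noise_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
W2 = rng.normal(scale=0.1, size=(hidden_dim, sample_dim))
b2 = np.zeros(sample_dim)

def generator(z):
    """Forward pass only: standard Gaussian noise in, virtual samples out."""
    h = relu(z @ W1 + b1)                  # input/hidden layers use ReLU
    return h @ W2 + b2                     # output matches the data dimension

z = rng.normal(size=(400, noise_dim))      # 400 standard Gaussian noise vectors
virtual = generator(z)                     # 400 virtual samples
```

Because each output row has the same dimensionality as an original training sample, the rows of `virtual` can be stacked directly onto the original training set.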
(4) Adjusting soft measurement model parameters according to the new training set
Because the training set has changed, the soft measurement model parameters need to be adjusted to fit the new training data. A grid search algorithm traverses a given set of parameters to optimize the model and avoid tedious manual tuning. Cross validation is used to evaluate the generalization ability of statistical analysis and machine learning algorithms on data sets independent of the training data. Finally, a method combining the grid search algorithm with ten-fold cross validation selects the optimal parameters automatically, so that the soft measurement model makes full use of the new training data.
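The grid search with ten-fold cross validation can be sketched as follows (the fold count of 10 matches the text, while the parameter grid and the scoring function here are placeholders for the model-specific CV error):

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Shuffle sample indices and split them into k cross-validation folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def grid_search(grid, score_fn):
    """Traverse the parameter grid and return the value with the lowest score."""
    return min(grid, key=score_fn)

folds = kfold_indices(50, k=10)                        # ten-fold CV index split
best = grid_search([0.1, 1.0, 10.0],                   # placeholder grid
                   score_fn=lambda p: (p - 1.0) ** 2)  # placeholder CV error
```

In practice `score_fn` would train the soft measurement model on nine folds, evaluate it on the held-out fold, and average the error over the ten rotations.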
Example (c): a multi-grade chemical process soft measurement modeling method for automatically generating samples comprises the following steps:
(1) Collecting and partitioning the multi-grade chemical process data set
The first grade has 72 samples, of which 50 are assigned to the training set and 22 to the test set. The second grade has 211 samples, of which 170 are assigned to the training set and 41 to the test set.
(2) Training the AGAN with the multi-grade chemical process data
The network parameters are initialized, the batch size is set to 30, standard Gaussian noise N(0, 1) is used as the input, and the whole training set is used to train the AGAN; the network weights are iterated until the loss function converges and the discriminator and the generator reach equilibrium.
(3) Automatically generating multi-grade chemical process data by utilizing AGAN, and constructing a new training set to train a soft measurement model
The generator in the AGAN is used: N(0, 1) Gaussian noise is input to generate a certain number of samples as virtual data. Together with the original training set (the real data), these are input to a conventional soft measurement model as the driving data, and the parameters of the soft measurement prediction model are adjusted to fit the new training data.
(4) Experimental results of predicting the key quality variables of the multi-grade chemical process data
The AGAN + MLP and AGAN + PLS variants of the method are compared experimentally with the single conventional soft measurement models MLP and PLS, respectively, using the root mean square error (RMSE for short) of the predictions as the evaluation criterion; the smaller the value, the better. The comparison results are shown in Tables 1 and 2, which list the experimental results of predicting the key quality variables from the multi-grade original training data and the multi-grade generated data under different conventional soft measurement models. The results show that the RMSE of the method is smaller than that of the conventional soft measurement models, i.e. the key quality variables of the multi-grade chemical process data are predicted better.
As shown in figs. 2a to 2b, the AGAN + MLP of the present method is compared with the single conventional soft measurement model MLP; as shown in figs. 2c to 2d, the AGAN + PLS of the present method is compared with the single conventional soft measurement model PLS. The effectiveness of the invention is verified from multiple angles by experimenting with several grades of sample data and different soft measurement models. The comparison of the prediction error values of the proposed method and the conventional soft measurement models is detailed in figs. 2a to 2d. As the figures show, the prediction error of the method is smaller than that of the conventional soft measurement models, so the accuracy of predicting the key quality variables of multi-grade chemical process data is improved. These results show that the AGAN can automatically generate additional data representing the process characteristics, improving the ability of a conventional soft measurement model to capture the feature information of the test sample set.
Table 1 compares the prediction results of the method with those of the conventional MLP model (the table values are provided as an image in the original document).
Table 2 compares the prediction results of the method with those of the conventional PLS model (the table values are provided as an image in the original document).
The method of the invention actively generates modeling data using the AGAN: when the original training data are scarce, it can produce additional data that follow the distribution of the original data set. It is applicable to predicting the key quality variables of multi-brand chemical processes and is broadly applicable. At the same time, it avoids the cost of re-acquiring sample data, such as the cost of manual sample labeling, sample acquisition time, and sample acquisition hardware.
The embodiments described in this specification merely illustrate implementations of the inventive concept. The scope of the invention should not be considered limited to the specific forms set forth in the embodiments, but also covers equivalents that will occur to those skilled in the art upon consideration of the inventive concept.

Claims (1)

1. A multi-grade chemical process soft measurement modeling method capable of automatically generating samples is characterized by comprising the following steps:
(1) partitioning multi-brand chemical process data sets
Dividing data acquired from a multi-grade chemical process into a training set and a testing set according to a set proportion as original data so as to facilitate cross validation;
(2) AGAN model principle and training process
Establishing a generation countermeasure network AGAN based on the gradient punishment and the Wasserstein distance, inputting the divided training set into the established generation countermeasure network, and training the network;
(3) construction of a New training set
Generating a virtual sample by using the trained AGAN, and forming a new training set together with the original training set;
(4) adjusting soft measurement model parameters according to the new training set
The new training set is used as driving data to train soft measurement modeling, parameters of a soft measurement model are adjusted to adapt to the new training set, and the trained soft measurement model is used for predicting key quality variables of the multi-grade chemical process;
the process of the step (2) is as follows:
step 2.1: establishing a generation countermeasure network based on the gradient penalty and the Wasserstein distance;
the Wasserstein distance is used to measure the distance between two distributions, and is calculated as follows:
Figure FDA0003385596440000011
wherein: prDistribution of real data; pgTo generate a distribution of data; II (P)r,Pg) Is PrAnd PgA set of all possible joint distributions combined; (x, y) -gamma are expressed as sampling from gamma to obtain a real data x and a generated data y; the | x-y | is the distance between the real data and the generated data;
Figure FDA0003385596440000021
as desired; inf represents a takedown bound; the whole W function is expressed as a lower bound taken for an expected value in all the joint distributions, and the expected value is defined as a Wasserstein distance;
converting the above calculation formula into the following form:

$$W(P_r, P_g) = \frac{1}{K} \sup_{\|f\|_L \leq K} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})]$$

wherein: $x$ obeys the distribution $P_r$; $\tilde{x}$ obeys the distribution $P_g$; $f(x)$ is a function of $x$ and $f(\tilde{x})$ is a function of $\tilde{x}$; $K$ is the Lipschitz constant of the function $f$, expressed as a constraint on a continuous function $f$ such that there exists a constant $K \geq 0$ for which any $x_1$ and $x_2$ in the domain satisfy $|f(x_1) - f(x_2)| \leq K|x_1 - x_2|$; $\|f\|_L \leq K$ means the Lipschitz constant of $f$ does not exceed $K$; $\sup$ denotes the supremum (least upper bound). The whole W function is $1/K$ times the supremum of $\mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})]$ over all functions $f$ whose Lipschitz constant does not exceed $K$;

$$K \cdot W(P_r, P_g) \approx \max_{w:\, \|f_w\|_L \leq K} \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f_w(\tilde{x})]$$

wherein: $f_w(x)$ is a function of $x$ and $f_w(\tilde{x})$ is a function of $\tilde{x}$; $f_w$ is a family of functions with parameter $w$, which is constructed by the network in the generation countermeasure network;
according to the theoretical derivation of the Wasserstein distance, the loss functions of the discriminator D and the generator G in the generation countermeasure network are as follows:

$$S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)]$$

$$S(G) = -\mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})]$$

wherein: $S(D)$ is the discriminator loss function; $S(G)$ is the generator loss function; $P_r$ is the distribution of the real data; $P_g$ is the distribution of the generated data; $x$ obeys the distribution $P_r$; $\tilde{x}$ obeys the distribution $P_g$; $D(x)$ is a function of $x$ and $D(\tilde{x})$ is a function of $\tilde{x}$;
the loss function of the improved discriminator is as follows:

$$S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1\right)^2\right]$$

$$\hat{x} = \varepsilon x + (1 - \varepsilon)\tilde{x}$$

wherein: $\hat{x}$ is a random interpolation sample on the line connecting $\tilde{x}$ and $x$; $P_{\hat{x}}$ is the distribution of $\hat{x}$; $\varepsilon$ is a random number in $[0, 1]$; $\lambda$ is the coefficient of the gradient penalty; $\|\nabla_{\hat{x}} D(\hat{x})\|_2$ is the 2-norm of the gradient of the function $D(\hat{x})$. The first term is the expectation of the discriminator output for the generated data, the second term is the expectation of the discriminator output for the real data, and the third term is the gradient penalty;
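Purely as an illustrative sketch of the gradient-penalty term (not part of the claim): for a linear critic D(x) = w·x the gradient ∇ₓD is w everywhere, so the penalty can be evaluated in closed form without automatic differentiation. All numerical values below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 10.0                         # gradient-penalty coefficient λ

w = np.array([0.6, 0.8])           # linear critic D(x) = w·x, so ∇_x D(x) = w everywhere
def critic(x):
    return x @ w

x_real = rng.standard_normal((4, 2))
x_fake = rng.standard_normal((4, 2))
eps = rng.uniform(0.0, 1.0, size=(4, 1))
x_hat = eps * x_real + (1.0 - eps) * x_fake     # random interpolates between real and fake

grad = np.tile(w, (4, 1))                        # analytic gradient of the linear critic
penalty = lam * np.mean((np.linalg.norm(grad, axis=1) - 1.0) ** 2)

# Improved discriminator loss: E[D(fake)] - E[D(real)] + gradient penalty
d_loss = critic(x_fake).mean() - critic(x_real).mean() + penalty
```

Here ‖w‖₂ = 1, so the penalty vanishes; any critic whose gradient norm deviates from 1 on the interpolates is penalized quadratically.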
step 2.2: network structure for generating countermeasure network based on gradient penalty and Wasserstein distance
The AGAN consists of two parts: a generator and a discriminator. The generator is a multilayer perceptron whose input is standard Gaussian noise; the number of layers is determined by the application object. The first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer, whose dimensionality matches that of the process data; the activation function of the input and hidden layers is the rectified linear unit ReLU. The discriminator is likewise a multilayer perceptron whose input is the original training data or the generated data; the number of layers is determined by the specific application object. The first layer is the input layer, the middle layers are hidden layers, the last layer is the output layer, and the activation function of the input and hidden layers is the rectified linear unit;
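Purely as an illustrative sketch (not part of the claim), such a ReLU multilayer perceptron can be written as a forward pass; the layer sizes and the batch of 30 noise vectors are assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def mlp_forward(x, layers):
    """Forward pass of a small MLP; ReLU on all but the last (output) layer."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = relu(x)
    return x

rng = np.random.default_rng(0)
dims = [8, 16, 5]   # noise dim -> hidden dim -> output dim (dimension of the process data)
layers = [(rng.standard_normal((a, b)) * 0.1, np.zeros(b)) for a, b in zip(dims, dims[1:])]
z = rng.standard_normal((30, 8))     # batch of standard Gaussian noise (batch size 30)
samples = mlp_forward(z, layers)     # generated (virtual) samples
```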
step 2.3: training process for generating countermeasure network based on gradient penalty and Wasserstein distance is as follows
The generator captures the sample data distribution, and the discriminator estimates the probability that a sample comes from the real data rather than the generated data. The input of the generator is a set of Gaussian noise with a preset distribution, which the generator converts into a set of virtual data, i.e., the generated data. The input of the discriminator is real data and generated data, and its output is the probability that the input sample is real. According to the generator loss function, maximizing the discriminator's output probability on generated data reduces the generator loss, i.e., improves the generator's capability, so that the virtual data comes closer to the distribution of the real data. According to the discriminator loss function, minimizing the discriminator's output probability on generated data improves the discriminator's capability, raising its accuracy in distinguishing real data from virtual data. The adversarial process between the discriminator and the generator in the generation countermeasure network is complete when the two reach equilibrium;
step 2.4: the parameter update process of the generation countermeasure network based on the gradient penalty and the Wasserstein distance is as follows:
initialize the relevant parameters: gradient penalty coefficient λ = 10; w, the discriminator parameters; θ, the generator parameters; n_critic = 5, the number of discriminator iterations per generator iteration; parameters of the adaptive moment estimation: α = 0.0001, β₁ = 0, β₂ = 0.9; m, the number of samples;
2.4.1) sample x from the real data distribution P_r; z obeys the latent space P(z) defined by the Gaussian distributed noise; ε is a random number in [0, 1];
2.4.2) the formula for calculating the discriminator loss is as follows:

$$\tilde{x} \leftarrow G_\theta(z)$$

$$\hat{x} \leftarrow \varepsilon x + (1 - \varepsilon)\tilde{x}$$

$$S(D)^{(i)} = D_w(\tilde{x}) - D_w(x) + \lambda \left(\|\nabla_{\hat{x}} D_w(\hat{x})\|_2 - 1\right)^2$$

wherein: $G_\theta(z)$ is a function of z that passes z through the generator to generate data; $S(D)^{(i)}$ is the loss computed in the discriminator for the i-th sample; $D_w(\tilde{x})$ is a function of $\tilde{x}$ and $D_w(x)$ is a function of $x$; $\|\nabla_{\hat{x}} D_w(\hat{x})\|_2$ is the 2-norm of the gradient of the function $D_w(\hat{x})$;
2.4.3) optimize the discriminator parameters according to the adaptive moment estimation algorithm Adam; the gradient calculation formula is as follows:

$$w \leftarrow \mathrm{Adam}\left(\nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)},\ w,\ \alpha,\ \beta_1,\ \beta_2\right)$$

wherein: $\nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)}$ is the gradient of the mean discriminator loss with respect to w; a batch gradient descent method is adopted, and the parameters w are updated with m samples each time;
2.4.4) repeat steps 2.4.1) to 2.4.3) n_critic times;
2.4.5) sample m samples $\{z^{(i)}\}_{i=1}^{m}$ from P(z) and optimize the generator parameters according to the Adam gradient descent method; the gradient calculation formula is as follows:

$$\theta \leftarrow \mathrm{Adam}\left(\nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w\left(G_\theta(z^{(i)})\right),\ \theta,\ \alpha,\ \beta_1,\ \beta_2\right)$$

wherein: $\nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w(G_\theta(z^{(i)}))$ is the gradient of the mean generator loss with respect to θ; a batch gradient descent method is adopted, and the parameters θ are updated with m samples each time.
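Purely as an illustrative sketch (not part of the claim), the Adam update used in steps 2.4.3) and 2.4.5) can be written out with the hyper-parameters from the initialization (α = 0.0001, β₁ = 0, β₂ = 0.9). The quadratic toy objective here is an assumption standing in for the critic and generator losses:

```python
import numpy as np

def adam_step(theta, grad, state, alpha=1e-4, beta1=0.0, beta2=0.9, eps=1e-8):
    """One Adam update with the hyper-parameters from step 2.4."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)

# Minimize f(θ) = θ² as a stand-in for the critic/generator losses.
theta = np.array([1.0])
state = (np.zeros(1), np.zeros(1), 0)
for _ in range(2000):
    grad = 2 * theta                            # analytic gradient of the toy objective
    theta, state = adam_step(theta, grad, state)
```

With a consistent gradient sign, each Adam step has magnitude close to α, so after 2000 steps θ has decreased from 1 toward 0 without overshooting.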
CN201810692852.XA 2018-04-26 2018-06-29 Multi-grade chemical process soft measurement modeling method capable of automatically generating samples Active CN109002686B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018103826280 2018-04-26
CN201810382628 2018-04-26

Publications (2)

Publication Number Publication Date
CN109002686A CN109002686A (en) 2018-12-14
CN109002686B true CN109002686B (en) 2022-04-08

Family

ID=64600767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810692852.XA Active CN109002686B (en) 2018-04-26 2018-06-29 Multi-grade chemical process soft measurement modeling method capable of automatically generating samples

Country Status (1)

Country Link
CN (1) CN109002686B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111376910B (en) * 2018-12-29 2022-04-15 北京嘀嘀无限科技发展有限公司 User behavior identification method and system and computer equipment
SG10201900755WA (en) * 2019-01-28 2020-08-28 Wilmar International Ltd Methods and system for processing lipid contents of at least one oil sample and simulating at least one training sample, and for predicting a blending formula, amongst others
CN110310345A (en) * 2019-06-11 2019-10-08 同济大学 A kind of image generating method generating confrontation network based on hidden cluster of dividing the work automatically
CN110598806A (en) * 2019-07-29 2019-12-20 合肥工业大学 Handwritten digit generation method for generating countermeasure network based on parameter optimization
CN111028146B (en) * 2019-11-06 2022-03-18 武汉理工大学 Image super-resolution method for generating countermeasure network based on double discriminators
TWI708190B (en) 2019-11-15 2020-10-21 財團法人工業技術研究院 Image recognition method, training system of object recognition model and training method of object recognition model
CN111192221B (en) * 2020-01-07 2024-04-16 中南大学 Aluminum electrolysis fire hole image repairing method based on deep convolution generation countermeasure network
CN111419213A (en) * 2020-03-11 2020-07-17 哈尔滨工业大学 ECG electrocardiosignal generation method based on deep learning
CN111794741B (en) * 2020-08-11 2023-08-18 中国石油天然气集团有限公司 Method for realizing sliding directional drilling simulator
CN112597702B (en) * 2020-12-21 2022-07-19 电子科技大学 Pneumatic modeling generation type confrontation network model training method based on radial basis function
CN112668196B (en) * 2021-01-04 2023-06-09 西安理工大学 Mechanism and data hybrid-driven generation type countermeasure network soft measurement modeling method
CN112989635B (en) * 2021-04-22 2022-05-06 昆明理工大学 Integrated learning soft measurement modeling method based on self-encoder diversity generation mechanism
CN113255732B (en) * 2021-04-29 2022-03-18 华中科技大学 Elastic workpiece robot grinding and polishing surface roughness prediction method based on virtual sample
CN113177078B (en) * 2021-04-30 2022-06-17 哈尔滨工业大学(威海) Approximate query processing algorithm based on condition generation model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451102A (en) * 2017-07-28 2017-12-08 江南大学 A kind of semi-supervised Gaussian process for improving self-training algorithm returns soft-measuring modeling method
CN107563510A (en) * 2017-08-14 2018-01-09 华南理工大学 A kind of WGAN model methods based on depth convolutional neural networks
CN107563155A (en) * 2017-08-08 2018-01-09 中国科学院信息工程研究所 A kind of safe steganography method and device based on generation confrontation network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Improved Training of Wasserstein GANs;Gulrajani 等;《http://arxiv.org/abs/1704.00028v1》;20170331;第1-11页 *

Also Published As

Publication number Publication date
CN109002686A (en) 2018-12-14

Similar Documents

Publication Publication Date Title
CN109002686B (en) Multi-grade chemical process soft measurement modeling method capable of automatically generating samples
CN109992921B (en) On-line soft measurement method and system for thermal efficiency of boiler of coal-fired power plant
CN111144552B (en) Multi-index grain quality prediction method and device
CN109558677A (en) A kind of hot rolling strip crown prediction technique based on data-driven
CN113469470B (en) Energy consumption data and carbon emission correlation analysis method based on electric brain center
CN109472397B (en) Polymerization process parameter adjusting method based on viscosity change
CN109934422B (en) Neural network wind speed prediction method based on time series data analysis
CN109884892A (en) Process industry system prediction model based on crosscorrelation time lag grey correlation analysis
CN110443417A (en) Multiple-model integration load forecasting method based on wavelet transformation
CN106656357B (en) Power frequency communication channel state evaluation system and method
CN113822499B (en) Train spare part loss prediction method based on model fusion
CN113408869A (en) Power distribution network construction target risk assessment method
CN112308298B (en) Multi-scenario performance index prediction method and system for semiconductor production line
CN110264079A (en) Hot-rolled product qualitative forecasting method based on CNN algorithm and Lasso regression model
CN112364560A (en) Intelligent prediction method for working hours of mine rock drilling equipment
CN115982141A (en) Characteristic optimization method for time series data prediction
CN110110447B (en) Method for predicting thickness of strip steel of mixed frog leaping feedback extreme learning machine
CN108073978A (en) A kind of constructive method of the ultra-deep learning model of artificial intelligence
CN116088453A (en) Production quality prediction model training method and device and production quality monitoring method
CN114692507A (en) Counting data soft measurement modeling method based on stacking Poisson self-encoder network
CN109920489A (en) It is a kind of that model and method for building up are hydrocracked based on Lasso-CCF-CNN
CN117370766A (en) Satellite mission planning scheme evaluation method based on deep learning
CN105354644A (en) Financial time series prediction method based on integrated empirical mode decomposition and 1-norm support vector machine quantile regression
CN116662925A (en) Industrial process soft measurement method based on weighted sparse neural network
CN108073979A (en) A kind of ultra-deep study of importing artificial intelligence knows method for distinguishing for image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant