CN109002686B - Multi-grade chemical process soft measurement modeling method capable of automatically generating samples - Google Patents
Multi-grade chemical process soft measurement modeling method capable of automatically generating samples
- Publication number: CN109002686B (application CN201810692852.XA)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
A multi-grade chemical process soft measurement modeling method capable of automatically generating samples comprises the following steps: (1) dividing data acquired from a multi-grade chemical process, as original data, into a training set and a test set; (2) establishing a generative adversarial network (AGAN) based on the gradient penalty and the Wasserstein distance, inputting the divided training set into the established network, and training it; (3) generating virtual samples with the trained AGAN and combining them with the original training set into a new training set; (4) training the soft measurement model with the new training set as driving data, adjusting the parameters of the soft measurement model to fit the new training set, and predicting the key quality variables of the multi-grade chemical process with the trained soft measurement model. The method generates data with the automatic generative adversarial network so as to make up for the shortage of data and improve the prediction accuracy of the data-driven soft measurement model.
Description
Technical Field
The invention relates to the field of chemical process soft measurement modeling, and in particular to a multi-grade chemical process soft measurement modeling method based on a generative adversarial network.
Background
In chemical processes, the estimation of key process quality variables plays an important role in keeping production equipment running continuously and smoothly, guaranteeing product quality, and making full use of production capacity. However, some important quality variables (such as catalyst activity, the melt index of a polymer, certain quality indexes of petroleum and petrochemical products, and the cell and product concentrations of a fermentation process) are difficult to measure directly with online sensors, so a soft measurement modeling method is required to predict them.
When the chemical process technology and its mechanisms are deeply understood, mechanistic modeling can achieve good results. In practice, however, many chemical processes are so complex, and the relevant expert knowledge so scarce, that an accurate mechanistic model is difficult to obtain. In addition, mechanism analysis is usually built on the assumption of ideal steady-state operation, and for chemical processes that cannot operate continuously at a stable point it is difficult to characterize the whole process. For these problems, it is feasible to build a data-driven soft measurement prediction model: useful modeling information is extracted from historical input and output data, so that the dominant and auxiliary variables of the model can be analyzed, and the mathematical relationships between the variables constructed, without much process knowledge. This is a general and effective system modeling method; its advantages are that the modeling is simple, no deep knowledge of the reaction mechanism is needed, and the accuracy is high.
Among soft measurement methods, many data-driven models, such as the Artificial Neural Network (ANN), Partial Least Squares (PLS), Support Vector Regression (SVR), and the Deep Belief Network (DBN), have been applied to chemical processes and have achieved certain results. However, the following problems remain: first, a sufficient amount of data is required to ensure the accuracy of the prediction; second, chemical process data are costly to acquire and the acquisition period is long.
Disclosure of Invention
In order to overcome the defects of high data-acquisition cost and low prediction accuracy of existing multi-grade chemical process soft measurement methods, the invention provides a multi-grade chemical process soft measurement modeling method that automatically generates samples, which generates data with an automatic generative adversarial network (AGAN for short) based on the gradient penalty and the Wasserstein distance, so as to make up for the shortage of data and improve the prediction accuracy of the data-driven soft measurement model.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a multi-grade chemical process soft measurement modeling method for automatically generating samples comprises the following steps:
(1) partitioning the multi-grade chemical process data set
Dividing data acquired from a multi-grade chemical process into a training set and a testing set according to a set proportion as original data so as to facilitate cross validation;
(2) AGAN model principle and training process
Establishing a generative adversarial network (AGAN) based on the gradient penalty and the Wasserstein distance, inputting the divided training set into the established network, and training the network;
(3) construction of a New training set
Generating a virtual sample by using the trained AGAN, and forming a new training set together with the original training set;
(4) adjusting soft measurement model parameters according to the new training set
Training the soft measurement model with the new training set as driving data, adjusting the parameters of the soft measurement model to fit the new training set, and predicting the key quality variables of the multi-grade chemical process with the trained soft measurement model.
Further, the process of the step (2) is as follows:
step 2.1: establish a generative adversarial network based on the gradient penalty and the Wasserstein distance;
the Wasserstein distance is used to measure the distance between two distributions, and is calculated as follows:

W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}[\, \|x - y\| \,]

wherein: P_r is the distribution of the real data; P_g is the distribution of the generated data; \Pi(P_r, P_g) is the set of all possible joint distributions whose marginals are P_r and P_g; (x, y) \sim \gamma denotes a real sample x and a generated sample y drawn from the joint distribution \gamma; \|x - y\| is the distance between the real sample and the generated sample; \mathbb{E} denotes expectation; \inf denotes the infimum (greatest lower bound); the whole W function takes the infimum of the expected distance over all joint distributions, and this value is defined as the Wasserstein distance;
converting the above calculation formula, via the Kantorovich–Rubinstein duality, into the following formula:

W(P_r, P_g) = \frac{1}{K} \sup_{\|f\|_L \le K} \left( \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})] \right)

wherein: x obeys the distribution P_r; \tilde{x} obeys the distribution P_g; f(x) and f(\tilde{x}) are the values of the function f at x and \tilde{x}; K is the Lipschitz constant of f, expressed as the constraint on a continuous function f that there exists a constant K \ge 0 such that |f(x_1) - f(x_2)| \le K |x_1 - x_2| for any x_1 and x_2 in the domain; \|f\|_L \le K means that the Lipschitz constant of f does not exceed K; \sup denotes the supremum (least upper bound); the whole W function is 1/K times the supremum of \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})] over all functions f whose Lipschitz constant does not exceed K;
restricting f to a parameterized family f_w gives the approximation:

W(P_r, P_g) \approx \max_{w} \left( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f_w(\tilde{x})] \right)

wherein: f_w(x) and f_w(\tilde{x}) are the values of f_w at x and \tilde{x}; f_w is a family of functions with parameter w, which is constructed by the discriminator network in the generative adversarial network;
according to the theoretical derivation of the Wasserstein distance, the loss functions of the discriminator D and the generator G in the generative adversarial network are as follows:

S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)]

S(G) = -\mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})]

wherein: S(D) is the discriminator loss function; S(G) is the generator loss function; P_r is the distribution of the real data; P_g is the distribution of the generated data; x obeys the distribution P_r; \tilde{x} obeys the distribution P_g; D(x) and D(\tilde{x}) are the discriminator outputs for x and \tilde{x};
the loss function of the improved discriminator is as follows:

S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[ \left( \|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1 \right)^2 \right]

wherein: \hat{x} = \epsilon x + (1 - \epsilon)\tilde{x} is a random interpolation sample on the line connecting \tilde{x} and x; P_{\hat{x}} is the distribution of \hat{x}; \epsilon is a random number in [0, 1]; \lambda is the gradient penalty coefficient; \|\nabla_{\hat{x}} D(\hat{x})\|_2 is the 2-norm of the gradient of the function D at \hat{x}; the first term is the expectation of the discriminator output on the generated data, the second term is the expectation of the discriminator output on the real data, and the third term is the gradient penalty;
step 2.2: network structure of the generative adversarial network based on the gradient penalty and the Wasserstein distance
The AGAN consists of two networks: a generator and a discriminator. The generator is a multilayer perceptron whose input is standard Gaussian noise; the number of layers is determined by the application object; the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer, whose dimensionality is the same as that of the sample data; the activation function of the input layer and the hidden layers is the rectified linear unit (ReLU). The discriminator is likewise a multilayer perceptron whose input is original training data or generated data; the number of layers is determined by the specific application object; the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer; the activation function of the input layer and the hidden layers is also the rectified linear unit;
the Sigmoid activation function has better activation near 0, the gradient of the positive saturation area and the negative saturation area is close to 0, gradient dispersion can be caused, and the gradient of the ReLU activation function in the part larger than 0 is a constant;
step 2.3: the training process of the generative adversarial network based on the gradient penalty and the Wasserstein distance is as follows
The generator is used to capture the sample-data distribution, and the discriminator is used to estimate the probability that a sample comes from the real data rather than the generated data; the input of the generator is a set of Gaussian noise, which provides the generator with a preset data distribution that the generator converts into a set of virtual data, i.e. the generated data; the input of the discriminator is real data and generated data, and its output indicates how likely the input sample is to be real; according to the generator loss function, maximizing the discriminator output on generated data reduces the generator loss, i.e. improves the generator, so that the virtual data approaches the distribution of the real data; according to the discriminator loss function, minimizing the discriminator loss improves the discriminator, so that it distinguishes real data from virtual data more accurately; this is the adversarial interplay between the discriminator and the generator in the generative adversarial network, and when the two reach equilibrium, the training process ends;
step 2.4: generation of the countermeasure network parameter updates based on the gradient penalties and the Wasserstein distance, the process is as follows:
initializing relevant parameters: gradient penalty coefficient: λ ═ 10; w is a discriminator parameter; theta is a generator parameter; number of discriminants, n, trained per training generatorcritic(ii) 5; parameters of the adaptive moment estimation: α ═ 0.0001, β1=0,β20.9; m is the number of samples;
2.4.1) sample x from the real-data distribution P_r; z obeys the latent space P(z) defined by the Gaussian noise; \epsilon is a random number in [0, 1];
2.4.2) the formula for calculating the discriminator loss is as follows:

S(D)^{(i)} = D_w(G_\theta(z^{(i)})) - D_w(x^{(i)}) + \lambda \left( \|\nabla_{\hat{x}} D_w(\hat{x}^{(i)})\|_2 - 1 \right)^2

wherein: G_\theta(z) is the generator output for the noise z, used for data generation; S(D)^{(i)} is the discriminator loss computed on the i-th sample; D_w(G_\theta(z)) is the discriminator output on generated data; D_w(x) is the discriminator output on real data; \|\nabla_{\hat{x}} D_w(\hat{x})\|_2 is the 2-norm of the gradient of the function D_w at the interpolation sample \hat{x};
2.4.3) optimize the parameters of the discriminator according to the adaptive moment estimation algorithm (Adam for short), with the gradient calculated as follows:

w \leftarrow \mathrm{Adam}\left( \nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)},\; w,\; \alpha,\; \beta_1,\; \beta_2 \right)

wherein: \nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)} is the gradient of the mean discriminator loss with respect to w, computed by mini-batch gradient descent; the parameters w are updated using m samples each time;
2.4.4) repeat steps 2.4.1) to 2.4.3) n_critic times;
2.4.5) sample m noise samples \{z^{(i)}\}_{i=1}^{m} from P(z) and optimize the parameters of the generator according to the Adam gradient descent method, with the gradient calculated as follows:

\theta \leftarrow \mathrm{Adam}\left( \nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w(G_\theta(z^{(i)})),\; \theta,\; \alpha,\; \beta_1,\; \beta_2 \right)

wherein: \nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w(G_\theta(z^{(i)})) is the gradient of the mean generator loss with respect to \theta, computed by mini-batch gradient descent; the parameters \theta are updated using m samples each time;
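The update schedule of step 2.4, with n_critic discriminator updates for every generator update, can be sketched as follows. This is an illustrative skeleton only; the loss computations and Adam steps are left as placeholder comments, and the function name is an assumption for illustration:

```python
def train_agan_schematic(n_epochs, n_critic=5, m=30):
    """Schematic of the AGAN update schedule of step 2.4: n_critic
    discriminator updates per generator update (the actual loss and
    Adam computations are placeholders)."""
    d_updates = g_updates = 0
    for _ in range(n_epochs):
        for _ in range(n_critic):
            # 2.4.1) sample m real points x, m noise vectors z, eps in [0, 1]
            # 2.4.2) compute the discriminator loss S(D) with gradient penalty
            # 2.4.3) Adam step on the discriminator parameters w
            d_updates += 1
        # 2.4.5) sample m noise vectors z and take an Adam step on theta
        g_updates += 1
    return d_updates, g_updates

print(train_agan_schematic(100))  # (500, 100)
```

With n_critic = 5, as initialized above, the discriminator is updated five times as often as the generator.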
the method actively generates modeling data by using the generated countermeasure network, and can generate more data reflecting process characteristics under the condition of less original training data. The method comprises the steps of training a generation countermeasure network based on Wasserstein distance and gradient punishment by using sample data acquired in a multi-brand chemical process, automatically increasing the sample data volume by using a generator in a generative model after the training is finished, combining original training data and generated data into new training data, and inputting the new training data into a traditional soft measurement model so as to improve the accuracy of the model for predicting the key quality quantity of the multi-brand chemical process.
The invention has the following beneficial effects: the method generates data with a generative adversarial network, automatically enlarging the sample set; it is simple and convenient, avoids the high-cost, long-period data acquisition of multi-grade chemical process soft measurement modeling, and effectively improves the prediction accuracy of the soft measurement model.
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2a is a graph comparing the prediction error values for the first-grade samples under the MLP and AGAN + MLP models;
FIG. 2b is a graph comparing the prediction error values for the second-grade samples under the MLP and AGAN + MLP models;
FIG. 2c is a graph comparing the prediction error values for the first-grade samples under the PLS and AGAN + PLS models;
FIG. 2d is a graph comparing the prediction error values for the second-grade samples under the PLS and AGAN + PLS models;
in FIG. 2a, the training sets of the MLP and AGAN + MLP models are, respectively, 50 original training samples and 50 original training samples plus 400 generated samples, with 22 test samples; in FIG. 2b, the training sets of the MLP and AGAN + MLP models are, respectively, 170 original training samples and 170 original training samples plus 100 generated samples, with 41 test samples; in FIG. 2c, the training sets of the PLS and AGAN + PLS models are, respectively, 50 original training samples and 50 original training samples plus 100 generated samples, with 22 test samples; in FIG. 2d, the training sets of the PLS and AGAN + PLS models are, respectively, 170 original training samples and 170 original training samples plus 200 generated samples, with 41 test samples.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 2d, a multi-grade chemical process soft measurement modeling method for automatically generating samples includes the following steps:
(1) partitioning multi-brand chemical process data sets
Each element of the multi-grade chemical process data set is a high-dimensional vector in which one dimension is the variable to be predicted and the remaining dimensions are known variables. The data collected from the multi-grade chemical process are used as the original data and divided into a training set and a test set according to a set proportion, so as to facilitate cross validation.
(2) The principle and training of the AGAN model comprise the following steps:
step 2.1: principle for generating countermeasure network based on gradient punishment and Wasserstein distance
The Wasserstein distance is used to measure the distance between two distributions; the calculation formula is as follows:

W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}[\, \|x - y\| \,]

wherein: P_r is the distribution of the real data; P_g is the distribution of the generated data; \Pi(P_r, P_g) is the set of all possible joint distributions whose marginals are P_r and P_g; (x, y) \sim \gamma denotes a real sample x and a generated sample y drawn from the joint distribution \gamma; \|x - y\| is the distance between the real sample and the generated sample; \mathbb{E} denotes expectation; \inf denotes the infimum (greatest lower bound). The whole W function takes the infimum of the expected distance over all joint distributions, and this value is defined as the Wasserstein distance.
Since the above formula cannot be solved directly, it is converted, via the Kantorovich–Rubinstein duality, into the following formula:

W(P_r, P_g) = \frac{1}{K} \sup_{\|f\|_L \le K} \left( \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})] \right)

wherein: x obeys the distribution P_r; \tilde{x} obeys the distribution P_g; f(x) and f(\tilde{x}) are the values of the function f at x and \tilde{x}; K is the Lipschitz constant of f, expressed as the constraint on a continuous function f that there exists a constant K \ge 0 such that |f(x_1) - f(x_2)| \le K |x_1 - x_2| for any x_1 and x_2 in the domain; \|f\|_L \le K means that the Lipschitz constant of f does not exceed K; \sup denotes the supremum (least upper bound). The whole W function is 1/K times the supremum of \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f(\tilde{x})] over all functions f whose Lipschitz constant does not exceed K.
Restricting f to a parameterized family f_w gives the approximation:

W(P_r, P_g) \approx \max_{w} \left( \mathbb{E}_{x \sim P_r}[f_w(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[f_w(\tilde{x})] \right)

wherein: f_w(x) and f_w(\tilde{x}) are the values of f_w at x and \tilde{x}; f_w is a family of functions with parameter w, which is constructed by the discriminator network in the generative adversarial network;
according to the theoretical derivation of the Wasserstein distance, the loss functions of the discriminator (D for short) and the generator (G for short) in the generative adversarial network are as follows:

S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)]

S(G) = -\mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})]

wherein: S(D) is the discriminator loss function; S(G) is the generator loss function; P_r is the distribution of the real data; P_g is the distribution of the generated data; x obeys the distribution P_r; \tilde{x} obeys the distribution P_g; D(x) and D(\tilde{x}) are the discriminator outputs for x and \tilde{x}.
The AGAN adds a gradient penalty (GP for short) term on the basis of the above theory to overcome the vanishing-gradient and exploding-gradient problems it suffers from. The loss function of the improved discriminator is as follows:

S(D) = \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[ \left( \|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1 \right)^2 \right]

wherein: \hat{x} = \epsilon x + (1 - \epsilon)\tilde{x} is a random interpolation sample on the line connecting \tilde{x} and x; P_{\hat{x}} is the distribution of \hat{x}; \epsilon is a random number in [0, 1]; \lambda is the gradient penalty coefficient, generally taken as 10; \|\nabla_{\hat{x}} D(\hat{x})\|_2 is the 2-norm of the gradient of the function D at \hat{x}. The first term is the expectation of the discriminator output on the generated data, the second term is the expectation of the discriminator output on the real data, and the third term is the gradient penalty.
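To make the penalty term concrete, consider a toy linear critic D(x) = w · x (an illustrative sketch, not the patent's implementation): its gradient with respect to the input is w everywhere, so the term λ E[(‖∇D(x̂)‖₂ − 1)²] can be evaluated analytically, without automatic differentiation:

```python
import numpy as np

def gradient_penalty_linear(w, x_real, x_fake, lam=10.0, seed=0):
    """Gradient-penalty term for a toy linear critic D(x) = w . x.

    For a linear critic, grad_x D(x) = w everywhere, so the penalty
    lam * E[(||grad_x D(x_hat)||_2 - 1)^2] is analytic. The interpolates
    x_hat = eps * x_real + (1 - eps) * x_fake are formed as in the loss
    above, although a linear critic's gradient does not depend on them.
    """
    rng = np.random.default_rng(seed)
    eps = rng.uniform(0.0, 1.0, size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake   # random interpolation samples
    grad_norms = np.full(len(x_hat), np.linalg.norm(w))  # ||grad D|| = ||w||
    return lam * float(np.mean((grad_norms - 1.0) ** 2))

rng = np.random.default_rng(1)
x_real = rng.normal(size=(30, 3))
x_fake = rng.normal(size=(30, 3))
w_unit = np.array([1.0, 0.0, 0.0])   # ||w|| = 1: penalty vanishes
w_big = np.array([2.0, 0.0, 0.0])    # ||w|| = 2: penalty = 10 * (2 - 1)^2
print(gradient_penalty_linear(w_unit, x_real, x_fake))  # 0.0
print(gradient_penalty_linear(w_big, x_real, x_fake))   # 10.0
```

The penalty is zero exactly when the critic's gradient norm is 1, which is how the 1-Lipschitz constraint of the duality is enforced softly.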
Step 2.2: network structure of the generative adversarial network based on the gradient penalty and the Wasserstein distance
The AGAN consists of two networks: a generator and a discriminator. The generator is a multilayer perceptron whose input is standard Gaussian noise; the number of layers is determined by the specific application object; the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer, whose dimensionality is the same as that of the sample data; the activation function of the input layer and the hidden layers is the rectified linear unit (ReLU for short). The discriminator is likewise a multilayer perceptron whose input is original training data or generated data; the number of layers is determined by the specific application object; the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer; the activation function of the input layer and the hidden layers is also the rectified linear unit.
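The generator and discriminator described above can be sketched as plain multilayer perceptrons. The layer widths and dimensions below are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

def relu(x):
    """Rectified linear unit activation."""
    return np.maximum(x, 0.0)

def init_mlp(sizes, rng):
    """Random (weight, bias) pairs for an MLP with the given layer sizes."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """ReLU on all layers except the last, which is left linear."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = relu(x)
    return x

rng = np.random.default_rng(0)
noise_dim, data_dim = 4, 6                        # assumed dimensions
G = init_mlp([noise_dim, 32, 32, data_dim], rng)  # generator: noise -> sample
D = init_mlp([data_dim, 32, 32, 1], rng)          # discriminator: sample -> score

z = rng.normal(size=(30, noise_dim))  # standard Gaussian noise, batch of 30
fake = forward(G, z)                  # generated ("virtual") samples
score = forward(D, fake)              # one scalar critic score per sample
print(fake.shape, score.shape)        # (30, 6) (30, 1)
```

Note that the generator's output layer matches the data dimensionality, and the discriminator's output is a single scalar per sample, as the structure above requires.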
The Sigmoid activation function behaves well only near 0: in its positive and negative saturation regions the gradient approaches 0, which can cause vanishing gradients, whereas the gradient of the ReLU activation function is constant over the part greater than 0, so the vanishing-gradient phenomenon is avoided. The ReLU activation function also has the advantages of sparsity and faster training.
Step 2.3: the training process of the generative adversarial network based on the gradient penalty and the Wasserstein distance is as follows
The generator is used to capture the sample-data distribution, and the discriminator is used to estimate the probability that a sample comes from the real data rather than the generated data. The input of the generator is a set of Gaussian noise, which provides the generator with a preset data distribution that the generator converts into a set of virtual data, i.e. the generated data. The input of the discriminator is real data and generated data, and its output indicates how likely the input sample is to be real. According to the generator loss function, maximizing the discriminator output on generated data reduces the generator loss, i.e. improves the generator, so that the virtual data approaches the distribution of the real data. According to the discriminator loss function, minimizing the discriminator loss improves the discriminator, so that it distinguishes real data from virtual data more accurately. This is the adversarial interplay between the discriminator and the generator in the generative adversarial network. When the discriminator and the generator reach equilibrium, the training process ends.
Step 2.4: parameter updates of the generative adversarial network based on the gradient penalty and the Wasserstein distance, the process being as follows:
Initialize the relevant parameters: gradient penalty coefficient \lambda = 10; w, the discriminator parameters; \theta, the generator parameters; number of discriminator updates per generator update n_critic = 5; adaptive moment estimation parameters \alpha = 0.0001, \beta_1 = 0, \beta_2 = 0.9; m, the number of samples in a batch.
2.4.1) sample x from the real-data distribution P_r; z obeys the latent space P(z) defined by the Gaussian noise; \epsilon is a random number in [0, 1].
2.4.2) the formula for calculating the discriminator loss is as follows:

S(D)^{(i)} = D_w(G_\theta(z^{(i)})) - D_w(x^{(i)}) + \lambda \left( \|\nabla_{\hat{x}} D_w(\hat{x}^{(i)})\|_2 - 1 \right)^2

wherein: G_\theta(z) is the generator output for the noise z, used for data generation; S(D)^{(i)} is the discriminator loss computed on the i-th sample; D_w(G_\theta(z)) is the discriminator output on generated data; D_w(x) is the discriminator output on real data; \|\nabla_{\hat{x}} D_w(\hat{x})\|_2 is the 2-norm of the gradient of the function D_w at the interpolation sample \hat{x}.
2.4.3) optimize the parameters of the discriminator according to the adaptive moment estimation algorithm (Adam for short), with the gradient calculated as follows:

w \leftarrow \mathrm{Adam}\left( \nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)},\; w,\; \alpha,\; \beta_1,\; \beta_2 \right)

wherein: \nabla_w \frac{1}{m} \sum_{i=1}^{m} S(D)^{(i)} is the gradient of the mean discriminator loss with respect to w, computed by mini-batch gradient descent; the parameters w are updated using m samples each time.
2.4.4) repeat steps 2.4.1) to 2.4.3) n_critic times.
2.4.5) sample m noise samples \{z^{(i)}\}_{i=1}^{m} from P(z) and optimize the parameters of the generator according to the Adam gradient descent method, with the gradient calculated as follows:

\theta \leftarrow \mathrm{Adam}\left( \nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w(G_\theta(z^{(i)})),\; \theta,\; \alpha,\; \beta_1,\; \beta_2 \right)

wherein: \nabla_\theta \frac{1}{m} \sum_{i=1}^{m} -D_w(G_\theta(z^{(i)})) is the gradient of the mean generator loss with respect to \theta, computed by mini-batch gradient descent; the parameters \theta are updated using m samples each time.
The adaptive moment estimation algorithm is computationally efficient, converges quickly, and occupies little memory; it is suitable when the data volume or the number of parameters is large, its hyperparameters have intuitive interpretations, and under ordinary circumstances they need no tuning.
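A minimal sketch of one Adam update step, using the hyperparameters given above (α = 0.0001, β₁ = 0, β₂ = 0.9); this is illustrative, not code from the patent:

```python
import numpy as np

def adam_step(theta, grad, state, alpha=1e-4, beta1=0.0, beta2=0.9, eps=1e-8):
    """One adaptive-moment-estimation (Adam) update with the hyperparameters
    listed above (alpha = 0.0001, beta1 = 0, beta2 = 0.9)."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)

# Minimize f(theta) = theta^2 (gradient 2 * theta) for a few steps.
theta = np.array(1.0)
state = (np.zeros(()), np.zeros(()), 0)
for _ in range(200):
    theta, state = adam_step(theta, 2.0 * theta, state)
print(float(theta))  # has moved from 1.0 toward the minimum at 0
```

Because Adam normalizes by the second-moment estimate, each step has magnitude close to α regardless of the raw gradient scale, which is what makes the fixed α = 0.0001 workable across both networks.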
(3) Construction of a New training set
The generator in the AGAN is extracted on its own to generate virtual samples; a generated virtual sample has the same dimensionality as the original training data and can be added to the original training set as new training data.
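Constructing the new training set amounts to stacking the generated samples under the original ones. A sketch with placeholder arrays: the counts (50 real plus 400 generated) match the first-grade experiment below, while the dimension and the random stand-ins for real and generated data are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_real, n_virtual, dim = 50, 400, 6  # 50 real + 400 generated; dim assumed

X_real = rng.normal(size=(n_real, dim))        # stands in for the original training set
X_virtual = rng.normal(size=(n_virtual, dim))  # stands in for generator output of the
                                               # same dimensionality as the real data

X_train_new = np.vstack([X_real, X_virtual])   # original + generated data
print(X_train_new.shape)  # (450, 6)
```

The original samples are kept unchanged at the front of the array, so the augmentation never discards measured data.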
(4) Adjusting soft measurement model parameters according to the new training set
Because the training set has changed, the soft measurement model parameters need to be adjusted to fit the new training data. The model is optimized by traversing a given parameter set with a grid search algorithm, avoiding the tedium of manual tuning. Cross validation is used to evaluate the generalization ability of statistical analysis and machine learning algorithms on data sets independent of the training data. Finally, the optimal parameters are selected automatically by combining the grid search algorithm with ten-fold cross validation, so that the soft measurement model makes full use of the new training data.
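The combination of grid search and ten-fold cross validation can be sketched as follows. Ridge regression stands in for the soft measurement model, and the parameter grid is an illustrative assumption, not the patent's choice:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error, the evaluation criterion used below."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def ridge_fit(X, y, lam):
    """Closed-form ridge regression, a stand-in for the soft-sensor model."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def grid_search_cv(X, y, lambdas, k=10):
    """Pick the regularization weight by k-fold cross-validated RMSE."""
    folds = np.array_split(np.arange(len(X)), k)
    scores = {}
    for lam in lambdas:
        errs = []
        for f in folds:
            mask = np.ones(len(X), bool)
            mask[f] = False                      # hold out fold f
            w = ridge_fit(X[mask], y[mask], lam)
            errs.append(rmse(y[f], X[f] @ w))
        scores[lam] = float(np.mean(errs))
    return min(scores, key=scores.get), scores

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
best, scores = grid_search_cv(X, y, lambdas=[0.01, 0.1, 1.0, 10.0])
print(best)
```

Each candidate parameter is scored on data it was not fitted on, so the automatically selected value reflects generalization rather than training fit.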
Example: a multi-grade chemical process soft measurement modeling method for automatically generating samples comprises the following steps:
(1) collecting and dividing the multi-grade chemical process data set
The first grade has 72 samples, of which 50 are assigned to the training set and 22 to the test set. The second grade has 211 samples, of which 170 are assigned to the training set and 41 to the test set.
(2) training the AGAN using the multi-grade chemical process data
The parameters of the network are initialized, the batch size is set to 30, Gaussian noise with mean 0 and variance 1 is used as input, and the whole training set is used to train the AGAN; the network weights are iterated until the loss functions converge and the discriminator and the generator reach equilibrium.
(3) Automatically generating multi-grade chemical process data by utilizing AGAN, and constructing a new training set to train a soft measurement model
Using the generator in the AGAN, Gaussian noise with mean 0 and variance 1 is input to generate a certain number of samples as virtual data; these, together with the original training set (the real data), serve as driving data for a conventional soft measurement model, and the parameters of the soft measurement prediction model are adjusted to fit the new training data.
(4) experimental results for predicting the key quality variables of the multi-grade chemical process data
In the experiments, the AGAN + MLP and AGAN + PLS variants of the method are compared with the single conventional soft measurement models MLP and PLS, respectively, using the root-mean-square error (RMSE for short) of prediction as the evaluation criterion; the smaller the value, the better. The comparison results are shown in Tables 1 and 2, which list the results of predicting the key quality variables from the multi-grade original training data and the multi-grade generated data under different conventional soft measurement models. The results show that the RMSE of the method is smaller than that of the conventional soft measurement models, i.e. it predicts the key quality variables of the multi-grade chemical process data better.
As shown in FIGS. 2a to 2b, the AGAN + MLP of the method is compared with the single conventional soft measurement model MLP, and as shown in FIGS. 2c to 2d, the AGAN + PLS of the method is compared with the single conventional soft measurement model PLS. By experimenting with sample data of multiple grades and different soft measurement models, the effectiveness of the invention is verified from several angles. The prediction error values of the proposed method and the conventional soft measurement models are compared in detail in FIGS. 2a to 2d. As can be seen from the figures, the prediction error of the method is smaller than that of the conventional soft measurement models, so the accuracy of predicting the key quality variables of multi-grade chemical process data is improved. These results show that the AGAN can automatically generate more data representative of the process characteristics, improving the ability of a conventional soft measurement model to capture the feature information of the test sample set.
Table 1 shows the comparison of the prediction results of the method with the conventional MLP model;
TABLE 1
Table 2 shows the comparison of the prediction results of the present method with those of the conventional PLS model;
TABLE 2
The method of the invention actively generates modeling data with the AGAN and can, when original training data are scarce, generate more data that conform to the distribution of the original data set; it is applicable to predicting the key quality variables of multi-grade chemical processes and has broad generality and universality. At the same time, it avoids the cost of re-acquiring sample data, such as the cost of manual sample labeling, sample acquisition time, and sample acquisition hardware.
The embodiments described in this specification are merely illustrative of implementations of the inventive concept, and the scope of the present invention should not be considered limited to the specific forms set forth in the embodiments; rather, it extends to equivalents thereof as may occur to those skilled in the art in light of the inventive concept.
Claims (1)
1. A multi-grade chemical process soft measurement modeling method capable of automatically generating samples is characterized by comprising the following steps:
(1) partitioning the multi-grade chemical process data set
Dividing data acquired from a multi-grade chemical process into a training set and a testing set according to a set proportion as original data so as to facilitate cross validation;
(2) AGAN model principle and training process
Establishing a generative adversarial network (AGAN) based on the gradient penalty and the Wasserstein distance, inputting the divided training set into the established network, and training the network;
(3) construction of a New training set
Generating a virtual sample by using the trained AGAN, and forming a new training set together with the original training set;
(4) adjusting soft measurement model parameters according to the new training set
The new training set is used as driving data to train soft measurement modeling, parameters of a soft measurement model are adjusted to adapt to the new training set, and the trained soft measurement model is used for predicting key quality variables of the multi-grade chemical process;
the process of the step (2) is as follows:
step 2.1: establishing a generation countermeasure network based on the gradient penalty and the Wasserstein distance;
the Wasserstein distance is used to measure the distance between two distributions, and is calculated as follows:
wherein: prDistribution of real data; pgTo generate a distribution of data; II (P)r,Pg) Is PrAnd PgA set of all possible joint distributions combined; (x, y) -gamma are expressed as sampling from gamma to obtain a real data x and a generated data y; the | x-y | is the distance between the real data and the generated data;as desired; inf represents a takedown bound; the whole W function is expressed as a lower bound taken for an expected value in all the joint distributions, and the expected value is defined as a Wasserstein distance;
converting the above calculation formula into the following formula:

W(P_r, P_g) = (1/K) · sup_{‖f‖_L ≤ K} ( E_{x~P_r}[f(x)] − E_{x̃~P_g}[f(x̃)] )

wherein: x obeys the distribution P_r; x̃ obeys the distribution P_g; f(x) is a function of x, and f(x̃) is a function of x̃; K is the Lipschitz constant of the function f, expressed as a constraint on a continuous function f requiring that there exist a constant K ≥ 0 such that, for any x_1 and x_2 in the domain, |f(x_1) − f(x_2)| ≤ K·|x_1 − x_2|; ‖f‖_L ≤ K means that the Lipschitz constant of the function f does not exceed K; sup denotes taking the supremum (least upper bound); the whole W function represents 1/K times the supremum of E_{x~P_r}[f(x)] − E_{x̃~P_g}[f(x̃)] over all functions f whose Lipschitz constant does not exceed K;
approximating the function f by a parameterized family f_w gives:

K · W(P_r, P_g) ≈ max_w ( E_{x~P_r}[f_w(x)] − E_{x̃~P_g}[f_w(x̃)] )

wherein: f_w(x) is a function of x, and f_w(x̃) is a function of x̃; f_w is a series of functions with parameters w, which are constructed by the networks in the generation countermeasure network;
according to the theoretical derivation of the Wasserstein distance, the loss functions of the discriminator D and the generator G in the generation countermeasure network are as follows:

S(D) = E_{x̃~P_g}[D(x̃)] − E_{x~P_r}[D(x)]

S(G) = −E_{x̃~P_g}[D(x̃)]

wherein: S(D) is the discriminator loss function; S(G) is the generator loss function; P_r is the distribution of the real data; P_g is the distribution of the generated data; x obeys the distribution P_r; x̃ obeys the distribution P_g; D(x) is a function of x, and D(x̃) is a function of x̃;
the loss function of the improved discriminator, with the gradient penalty added, is as follows:

S(D) = E_{x̃~P_g}[D(x̃)] − E_{x~P_r}[D(x)] + λ·E_{x̂~P_x̂}[(‖∇_x̂ D(x̂)‖_2 − 1)²]

wherein: x̂ = ε·x + (1 − ε)·x̃ is a random interpolation sample on the line connecting x̃ and x; P_x̂ is the distribution of x̂; ε is a random number in [0,1]; λ is the coefficient of the gradient penalty; ‖∇_x̂ D(x̂)‖_2 is the 2-norm of the gradient of the function D at x̂; the first part is the expectation of the probability that the data generated by the generator is judged as real data, the second part is the expectation of the probability that the real data is judged as real data, and the third part is the gradient penalty;
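The penalty term can be made concrete with a critic simple enough that its input gradient has a closed form. A sketch assuming a linear critic D(x) = w·x, whose gradient with respect to the input is w everywhere (all names and values illustrative):

```python
import numpy as np

def gradient_penalty(w, x_real, x_fake, lam=10.0, seed=0):
    """Gradient penalty term for a toy linear critic D(x) = w . x.
    Its input gradient is w everywhere, so ||grad D(x_hat)||_2 equals
    ||w||_2 at every interpolate x_hat = eps*x_real + (1 - eps)*x_fake."""
    rng = np.random.default_rng(seed)
    eps = rng.uniform(size=(len(x_real), 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake  # random points on the real/fake line
    grad_norm = np.linalg.norm(w)                # constant for a linear critic
    return lam * (grad_norm - 1.0) ** 2

x_real = np.array([[1.0, 2.0], [3.0, 4.0]])
x_fake = np.array([[0.0, 0.0], [1.0, 1.0]])
print(gradient_penalty(np.array([1.0, 0.0]), x_real, x_fake))  # 0.0: unit gradient norm is not penalized
print(gradient_penalty(np.array([3.0, 4.0]), x_real, x_fake))  # 160.0: norm 5 gives 10*(5-1)^2
```

For a nonlinear critic the gradient norm varies with x̂, which is why the penalty is averaged over random interpolates in the full loss.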
step 2.2: network structure for generating countermeasure network based on gradient penalty and Wasserstein distance
The AGAN consists of two parts, a generator and a discriminator; the generator is composed of a multi-layer perceptron whose input is noise obeying a standard Gaussian distribution, the number of perceptron layers is determined according to the application object, the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer, whose dimensionality is the same as that of the original process data; the activation functions of the input layer and the hidden layers are rectified linear units ReLU; the discriminator is composed of a multi-layer perceptron whose input is original training data or generated data, the number of perceptron layers is determined according to the specific application object, the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer; the activation functions of the input layer and the hidden layers are rectified linear units;
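A minimal forward-pass sketch of the multi-layer perceptron structure described above (plain NumPy; the layer sizes are illustrative, chosen only to show that the generator maps noise to samples of the data's dimensionality):

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a plain multi-layer perceptron: ReLU on every layer
    except the last, mirroring the generator/discriminator structure above."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b
        if i < len(weights) - 1:      # hidden layers use the rectified linear unit
            h = np.maximum(h, 0.0)
    return h

rng = np.random.default_rng(0)
# generator sketch: 3-d standard Gaussian noise -> 8 hidden units -> 5-d sample
weights = [rng.normal(size=(3, 8)), rng.normal(size=(8, 5))]
biases = [np.zeros(8), np.zeros(5)]
z = rng.standard_normal((10, 3))      # batch of 10 noise vectors
samples = mlp_forward(z, weights, biases)
print(samples.shape)  # (10, 5): output dimensionality matches the process data
```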
step 2.3: training process for generating countermeasure network based on gradient penalty and Wasserstein distance is as follows
The generator is used for capturing the sample data distribution, and the discriminator is used for estimating the probability that a sample comes from the real data rather than the generated data; the input of the generator is a group of Gaussian distribution noise with a preset distribution, which the generator converts into a group of virtual data, namely the generated data; the input of the discriminator is real data or generated data, and its output is the probability that the input sample is real data; according to the loss function of the generator, maximizing the output probability of the discriminator on generated samples reduces the loss of the generator, namely improves the capability of the generator, so that the virtual data comes closer to the distribution of the real data; according to the loss function of the discriminator, minimizing the output probability of the discriminator on generated samples improves the capability of the discriminator, so that the discriminator distinguishes real data from virtual data more accurately; the antagonism of the discriminator and the generator in the generation countermeasure network is completed when the two reach equilibrium;
step 2.4: the parameter updating process of the generation countermeasure network based on the gradient penalty and the Wasserstein distance is as follows:
initializing the relevant parameters: the gradient penalty coefficient λ = 10; w, the discriminator parameters; θ, the generator parameters; the number of discriminator iterations per generator iteration n_critic = 5; the parameters of the adaptive moment estimation: α = 0.0001, β_1 = 0, β_2 = 0.9; m, the number of samples per batch;
2.4.1) sampling x from the real data distribution P_r, sampling z from the latent space P(z) defined by the Gaussian distributed noise, and sampling ε as a random number in [0,1];
2.4.2) the formula for calculating the discriminator loss is as follows:

S(D)^(i) = D_w(G_θ(z^(i))) − D_w(x^(i)) + λ·(‖∇_x̂ D_w(x̂^(i))‖_2 − 1)²

wherein: G_θ(z) is a function of z, which passes z through the generator to generate data; S(D)^(i) is the loss calculated in the discriminator for the i-th sample; D_w(G_θ(z)) is a function of G_θ(z); D_w(x) is a function of x; ‖∇_x̂ D_w(x̂)‖_2 is the 2-norm of the gradient of the function D_w at x̂;
2.4.3) optimizing the parameters of the discriminator according to the adaptive moment estimation algorithm Adam, wherein the gradient calculation formula is as follows:

w ← Adam(∇_w (1/m)·Σ_{i=1..m} S(D)^(i), w, α, β_1, β_2)

wherein: ∇_w (1/m)·Σ_{i=1..m} S(D)^(i) is the gradient of the average discriminator loss with respect to w; the update is realized by a batch gradient descent method, and the parameters w are updated with m samples each time;
2.4.4) repeating steps 2.4.1) to 2.4.3) n_critic times;
2.4.5) sampling m samples {z^(i), i = 1, ..., m} from P(z), and optimizing the parameters of the generator according to the Adam gradient descent method, wherein the gradient calculation formula is as follows:

θ ← Adam(∇_θ (1/m)·Σ_{i=1..m} −D_w(G_θ(z^(i))), θ, α, β_1, β_2)
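The single-round mechanics of steps 2.4.1) to 2.4.5) can be illustrated with a one-parameter linear generator and critic, so that every gradient has a closed form; plain gradient descent stands in for Adam here, and all numeric values are illustrative:

```python
import numpy as np

# One critic update and one generator update, with a linear critic
# D_w(x) = w * x and a linear generator G_theta(z) = theta * z so that
# every gradient is analytic; plain gradient descent replaces Adam.
lam, alpha = 10.0, 0.1
w, theta = 2.0, 1.0

x = np.array([1.0, 3.0])        # "real" samples, mean 2.0
z = np.array([0.5, 1.5])        # noise samples, mean 1.0
fake = theta * z                # generated samples: [0.5, 1.5], mean 1.0

# S(D) = mean(D(fake)) - mean(D(real)) + lam * (|w| - 1)^2;
# for a linear critic the input-gradient norm is simply |w|
grad_w = fake.mean() - x.mean() + 2.0 * lam * (abs(w) - 1.0) * np.sign(w)
w = w - alpha * grad_w          # 2.0 - 0.1 * (1.0 - 2.0 + 20.0) = 0.1

# S(G) = -mean(D_w(G_theta(z))) = -w * theta * mean(z)
grad_theta = -w * z.mean()
theta = theta - alpha * grad_theta

print(round(float(w), 6), round(float(theta), 6))  # 0.1 1.01
```

In the full method these two updates are interleaved, with n_critic critic updates per generator update, until the two networks reach equilibrium.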
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2018103826280 | 2018-04-26 | ||
CN201810382628 | 2018-04-26 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109002686A CN109002686A (en) | 2018-12-14 |
CN109002686B true CN109002686B (en) | 2022-04-08 |
Family
ID=64600767
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810692852.XA Active CN109002686B (en) | 2018-04-26 | 2018-06-29 | Multi-grade chemical process soft measurement modeling method capable of automatically generating samples |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109002686B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111376910B (en) * | 2018-12-29 | 2022-04-15 | 北京嘀嘀无限科技发展有限公司 | User behavior identification method and system and computer equipment |
SG10201900755WA (en) * | 2019-01-28 | 2020-08-28 | Wilmar International Ltd | Methods and system for processing lipid contents of at least one oil sample and simulating at least one training sample, and for predicting a blending formula, amongst others |
CN110310345A (en) * | 2019-06-11 | 2019-10-08 | 同济大学 | A kind of image generating method generating confrontation network based on hidden cluster of dividing the work automatically |
CN110598806A (en) * | 2019-07-29 | 2019-12-20 | 合肥工业大学 | Handwritten digit generation method for generating countermeasure network based on parameter optimization |
CN111028146B (en) * | 2019-11-06 | 2022-03-18 | 武汉理工大学 | Image super-resolution method for generating countermeasure network based on double discriminators |
TWI708190B (en) | 2019-11-15 | 2020-10-21 | 財團法人工業技術研究院 | Image recognition method, training system of object recognition model and training method of object recognition model |
CN111192221B (en) * | 2020-01-07 | 2024-04-16 | 中南大学 | Aluminum electrolysis fire hole image repairing method based on deep convolution generation countermeasure network |
CN111419213A (en) * | 2020-03-11 | 2020-07-17 | 哈尔滨工业大学 | ECG electrocardiosignal generation method based on deep learning |
CN111794741B (en) * | 2020-08-11 | 2023-08-18 | 中国石油天然气集团有限公司 | Method for realizing sliding directional drilling simulator |
CN112597702B (en) * | 2020-12-21 | 2022-07-19 | 电子科技大学 | Pneumatic modeling generation type confrontation network model training method based on radial basis function |
CN112668196B (en) * | 2021-01-04 | 2023-06-09 | 西安理工大学 | Mechanism and data hybrid-driven generation type countermeasure network soft measurement modeling method |
CN112989635B (en) * | 2021-04-22 | 2022-05-06 | 昆明理工大学 | Integrated learning soft measurement modeling method based on self-encoder diversity generation mechanism |
CN113255732B (en) * | 2021-04-29 | 2022-03-18 | 华中科技大学 | Elastic workpiece robot grinding and polishing surface roughness prediction method based on virtual sample |
CN113177078B (en) * | 2021-04-30 | 2022-06-17 | 哈尔滨工业大学(威海) | Approximate query processing algorithm based on condition generation model |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107451102A (en) * | 2017-07-28 | 2017-12-08 | 江南大学 | A kind of semi-supervised Gaussian process for improving self-training algorithm returns soft-measuring modeling method |
CN107563510A (en) * | 2017-08-14 | 2018-01-09 | 华南理工大学 | A kind of WGAN model methods based on depth convolutional neural networks |
CN107563155A (en) * | 2017-08-08 | 2018-01-09 | 中国科学院信息工程研究所 | A kind of safe steganography method and device based on generation confrontation network |
Non-Patent Citations (1)
Title |
---|
Improved Training of Wasserstein GANs; Gulrajani et al.; http://arxiv.org/abs/1704.00028v1; 2017-03-31; pp. 1-11 *
Also Published As
Publication number | Publication date |
---|---|
CN109002686A (en) | 2018-12-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109002686B (en) | Multi-grade chemical process soft measurement modeling method capable of automatically generating samples | |
CN109992921B (en) | On-line soft measurement method and system for thermal efficiency of boiler of coal-fired power plant | |
CN111144552B (en) | Multi-index grain quality prediction method and device | |
CN109558677A (en) | A kind of hot rolling strip crown prediction technique based on data-driven | |
CN113469470B (en) | Energy consumption data and carbon emission correlation analysis method based on electric brain center | |
CN109472397B (en) | Polymerization process parameter adjusting method based on viscosity change | |
CN109934422B (en) | Neural network wind speed prediction method based on time series data analysis | |
CN109884892A (en) | Process industry system prediction model based on crosscorrelation time lag grey correlation analysis | |
CN110443417A (en) | Multiple-model integration load forecasting method based on wavelet transformation | |
CN106656357B (en) | Power frequency communication channel state evaluation system and method | |
CN113822499B (en) | Train spare part loss prediction method based on model fusion | |
CN113408869A (en) | Power distribution network construction target risk assessment method | |
CN112308298B (en) | Multi-scenario performance index prediction method and system for semiconductor production line | |
CN110264079A (en) | Hot-rolled product qualitative forecasting method based on CNN algorithm and Lasso regression model | |
CN112364560A (en) | Intelligent prediction method for working hours of mine rock drilling equipment | |
CN115982141A (en) | Characteristic optimization method for time series data prediction | |
CN110110447B (en) | Method for predicting thickness of strip steel of mixed frog leaping feedback extreme learning machine | |
CN108073978A (en) | A kind of constructive method of the ultra-deep learning model of artificial intelligence | |
CN116088453A (en) | Production quality prediction model training method and device and production quality monitoring method | |
CN114692507A (en) | Counting data soft measurement modeling method based on stacking Poisson self-encoder network | |
CN109920489A (en) | It is a kind of that model and method for building up are hydrocracked based on Lasso-CCF-CNN | |
CN117370766A (en) | Satellite mission planning scheme evaluation method based on deep learning | |
CN105354644A (en) | Financial time series prediction method based on integrated empirical mode decomposition and 1-norm support vector machine quantile regression | |
CN116662925A (en) | Industrial process soft measurement method based on weighted sparse neural network | |
CN108073979A (en) | A kind of ultra-deep study of importing artificial intelligence knows method for distinguishing for image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||