CN113360485A - Engineering data enhancement algorithm based on generation of countermeasure network - Google Patents

Info

Publication number
CN113360485A
Authority
CN
China
Prior art keywords
data
generator
discriminator
generation
engineering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110528930.4A
Other languages
Chinese (zh)
Inventor
刘洋
申迎港
王浩成
张茜
蔡宗熙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202110528930.4A
Publication of CN113360485A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses an engineering data enhancement algorithm based on a generative adversarial network (GAN), which aims to provide researchers with sufficient data so that more accurate research can be carried out. The algorithm comprises the following steps: acquiring original data and preprocessing it, including 'shutdown' data processing, noise reduction and normalization, to obtain a set of smooth and stable construction data; feeding the processed data into the GAN data enhancement algorithm and performing data enhancement through the adversarial interplay of a generator and a discriminator; and outputting engineering data whose distribution is similar to that of the original data. The method combines data preprocessing with data enhancement, removes multiple groups of useless data including noise, uses the adversarial principle of the generator and the discriminator in a generative adversarial network to enhance on-board construction data, and solves the problem of data shortage in research.

Description

Engineering data enhancement algorithm based on a generative adversarial network
Technical Field
The invention relates to an engineering data enhancement algorithm, in particular to an engineering data enhancement algorithm based on a generative adversarial network (GAN).
Background
With the development of deep learning in recent years, deep neural networks have revolutionized classification tasks. A classifier based on a deep neural network can achieve high accuracy provided that sufficient labeled samples are available as training data. In some situations, however, labeled data are difficult to collect, or obtaining them is expensive, time-consuming and labor-intensive. When data are insufficient, a neural network is difficult to train stably and generalizes poorly.
In response to this problem, Goodfellow of the University of Montreal proposed a deep learning method based on the generative adversarial network (GAN), which has been applied to the problem that neural networks are difficult to train when data are scarce. Since its introduction the GAN has attracted wide attention, and researchers have made various improvements to the model. The deep convolutional generative adversarial network proposed by Radford combines the GAN with deep convolution. That model removes the fully connected layers and pooling layers of the original network and applies batch normalization in both the generator and the discriminator, which shortens the training time of the model and greatly improves the stability of the GAN and the quality of the generated images. Compared with other generative models, a GAN does not need to make assumptions about the distribution of the original data, which makes the model relatively flexible; however, the absence of prior modeling can also make the model uncontrollable. Mirza therefore introduced corresponding condition variables in the network construction process, yielding the conditional generative adversarial network. The condition variable may be any information; when the condition variable is the corresponding label, the model changes from unsupervised to supervised.
Data enhancement, also referred to as data augmentation, expands a data set of limited structure and size so that the limited data yield more value. Data enhancement comprises supervised and unsupervised data enhancement. Supervised data enhancement can be divided into single-sample and multi-sample data enhancement; unsupervised data enhancement follows two directions, generating new data and learning an enhancement strategy.
However, GAN models are mostly used to enhance image data or language data, and the enhancement of engineering data has not yet been realized. Because field construction data are difficult to obtain, researchers cannot obtain enough engineering data, which makes model verification difficult.
Building on the data enhancement capability of the GAN, the method improves on the original algorithm and applies it to the enhancement of construction geological data and TBM operation data. The original data are enhanced according to the generation principle of the GAN, so that the data are expanded in preparation for subsequent big-data analysis and for mining and exploring the coupling relationships between distributions.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an engineering data enhancement algorithm based on a generative adversarial network, which solves the problem of research stagnating because of insufficient engineering data.
The technical scheme of the invention is as follows:
An engineering data enhancement algorithm based on a generative adversarial network, comprising the following steps:
Step 1: preprocessing the original data; abnormal data and noise data are removed and shutdown data are screened out, so that the engineering data are smoother; noise in the data is reduced by noise reduction processing.
Step 2: normalizing the preprocessed data to avoid errors caused by differences in units and magnitudes in the original data.
Step 3: building and training the GAN model; a GAN data enhancement algorithm is established, and the preprocessed data are fed into the model for training.
Step 4: testing the model; several groups of engineering data are fed into the model, and the results are compared with the original data to confirm their accuracy, completing the enhancement of the engineering data.
Step 5: optimizing the model; the model is optimized by changing parameters such as the learning rate and the form of the loss function, so that more accurate results can be generated.
Further, the cleaning of abnormal data during the data preprocessing of step 1 is implemented as follows:
box plot processing is performed on the original data; abnormal data values are screened out through the box plot and deleted;
further, regarding the noise reduction in step 1, the original data contain a large amount of noise and must be smoothed by noise reduction processing. The main methods are wavelet transform and moving average denoising; the invention adopts the moving average method. The processing of 'shutdown' data means the handling of the engineering parameter values recorded in a shutdown state, and is implemented as follows:
the shutdown data in the original data are screened using the following formula, which judges whether a single construction record is 'shutdown' data: when any one of the four main working parameters of the TBM is zero, the current construction record is determined to be 'shutdown' data.
P=f(RSP)f(T)f(F)f(V)
In the above formula, RSP is the cutterhead rotation speed, T is the cutterhead torque, F is the thrust, and V is the advance speed.
The function f is defined as follows:
\[ f(x) = \begin{cases} 0, & x = 0 \\ 1, & x \neq 0 \end{cases} \]
so that P = 0 identifies a 'shutdown' record.
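The screening rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the indicator f is assumed to return 0 for a zero parameter and 1 otherwise (the form implied by the surrounding text, since the original equation image is not available), and the record layout and sample values are hypothetical.

```python
def f(x):
    """Indicator assumed from the text: 0 when the parameter is zero, else 1."""
    return 0 if x == 0 else 1


def is_shutdown(record):
    """A record is 'shutdown' data when P = f(RSP) f(T) f(F) f(V) equals 0."""
    p = f(record["RSP"]) * f(record["T"]) * f(record["F"]) * f(record["V"])
    return p == 0


def drop_shutdown(records):
    """Keep only the records logged while the TBM was working."""
    return [r for r in records if not is_shutdown(r)]


# Hypothetical construction records (keys follow the symbols in the formula).
records = [
    {"RSP": 6.1, "T": 2100.0, "F": 9800.0, "V": 45.0},  # working
    {"RSP": 0.0, "T": 0.0, "F": 350.0, "V": 0.0},       # stopped
    {"RSP": 5.8, "T": 1950.0, "F": 9400.0, "V": 0.0},   # one zero parameter
]
working = drop_shutdown(records)
```

Multiplying the four indicators means a single zero parameter is enough to flag the record, which matches the "any one of the four" wording.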
further, in step 2 the preprocessed data are normalized, converting all values into the range 0 to 1 so that higher accuracy can be achieved. The specific implementation is as follows:
suppose a data set v_0 = {v_1, v_2, ..., v_n} has maximum value v_max and minimum value v_min; every value in the set is converted according to the following formula:
\[ v_i' = \frac{v_i - v_{min}}{v_{max} - v_{min}} \]
the converted values are the normalized data, and data enhancement performs better on normalized data;
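As a small worked example of the normalization step, the sketch below applies min-max scaling to a short series. The function name, the handling of a constant series and the sample thrust values are assumptions made for illustration only.

```python
def min_max_normalize(values):
    """Scale values into [0, 1] using the minimum and maximum of the series."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Assumption: a constant series maps to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - v_min) / span for v in values]


# Hypothetical thrust readings in kN.
thrust = [9400.0, 9800.0, 10200.0, 9600.0]
scaled = min_max_normalize(thrust)
```

Because only the offset and the scale change, the shape of the distribution is preserved while the units drop out.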
further, in step 3 the GAN model mainly comprises a generator G and a discriminator D;
the generator G takes a group of randomly distributed noise as input and generates a group of engineering data that conform to the distribution of the original data;
the discriminator D is a classifier whose inputs include the original data and the data generated by the generator G, and whose output is a probability value;
the adversarial principle of the GAN data enhancement algorithm is as follows: the aim of the generator is to make the generated data fool the discriminator as far as possible, so that the probability with which the discriminator judges the generated data to be real is as high as possible; the aim of the discriminator is to identify the data generated by the generator as fake as accurately as possible, so that the probability it outputs for such data is as small as possible; the two compete and thereby improve each other until the data generated by the generator successfully fool the discriminator, at which point the output of the generator is data that conform to the original data distribution. Through adversarial training, the samples generated by the generator come closer to the original data samples, which effectively improves the quality of data enhancement.
The network structure of the GAN data enhancement model is divided into a generator G and a discriminator D, each a three-layer neural network whose input layer, hidden layer and output layer are all linear layers. Through the linear transformations and the nonlinear conversion of the activation function, original data meeting the conditions can be enhanced.
Further, the generation principle of the GAN data enhancement model is a minimax game, whose objective function is as follows:
\[ \min_G \max_D V(D, G) = E_{x}[\log D(x)] + E_{z}[\log(1 - D(G(z)))] \]
where E(·) denotes the expectation, x is drawn from the original sample data, and z is the noise data input to the generator G.
Further, step 3 comprises the following steps:
Step 3.1: initializing the parameters of the two neural networks, the generator G and the discriminator D.
Step 3.2: n samples are extracted from the original data, and the generator generates n samples from the defined noise distribution. The generator G is fixed and the discriminator D is trained so that it can distinguish real data from generated data.
Step 3.3: iterative training is carried out in a loop, with the discriminator D and the generator G competing against each other. In the ideal state the final discriminator D cannot tell whether data come from the original data or from the generator G; the discrimination probability is then 0.5 and training is complete.
Noise data are input into the trained generator model; after transformation by the three-layer neural network of the generator G, the model produces generated samples whose distribution approximates that of the original data, thereby expanding the original data.
Furthermore, the data dimension in the training process can be set as required. Because engineering data are one-dimensional, the original data are treated as a one-dimensional matrix, which is fed into the generator to output a one-dimensional matrix.
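The value 0.5 quoted for the ideal state follows from the standard analysis of the GAN objective. The derivation below is background from the general GAN literature rather than from this patent; the densities p_data (original data) and p_g (generated data) are notation introduced here for illustration.

```latex
% For a fixed generator G, write the objective as an integral over x:
V(D, G) = \int \left[ p_{data}(x) \log D(x) + p_g(x) \log\bigl(1 - D(x)\bigr) \right] dx

% Maximizing the integrand pointwise in D(x) gives the optimal discriminator:
D^{*}(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}

% When the generator reproduces the original distribution, p_g = p_{data}, so
D^{*}(x) = \frac{1}{2}
```

In practice the discriminator output only approaches 0.5, so it serves as a convergence indicator rather than an exact stopping condition.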
Further, an activation function is added after each linear layer in the neural networks of the generator G and the discriminator D. Without an activation function, a stack of linear layers is equivalent to a single linear function. The sigmoid activation function is therefore introduced into the algorithm, so that the final output values lie in the range 0 to 1. The formula is as follows:
\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]
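The remark that linear layers without an activation collapse into one linear function can be checked numerically. The following sketch uses scalar layers with arbitrary, hypothetical weights; it is only an illustration of the argument, not part of the patented network.

```python
import math


def sigmoid(x):
    """Sigmoid activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear(x, w, b):
    """A one-dimensional linear layer."""
    return w * x + b


# Hypothetical weights and biases for two stacked scalar layers.
w1, b1, w2, b2 = 2.0, -1.0, 0.5, 3.0


def stacked_linear(x):
    """Two linear layers with no activation in between."""
    return linear(linear(x, w1, b1), w2, b2)


def collapsed(x):
    """The single linear layer that the stack above is equivalent to."""
    return linear(x, w2 * w1, w2 * b1 + b2)


def with_activation(x):
    """The same two layers, each followed by a sigmoid."""
    return sigmoid(linear(sigmoid(linear(x, w1, b1)), w2, b2))


xs = [-2.0, 0.0, 1.5]
same = all(abs(stacked_linear(x) - collapsed(x)) < 1e-12 for x in xs)
outputs = [with_activation(x) for x in xs]
```

Here `same` is true for every input, while the activated version is no longer linear and always produces values strictly between 0 and 1, matching the range of the normalized data.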
furthermore, whereas other generative adversarial network models feed data into the model in mini-batches, for engineering data it is preferable to set the number of batches to 1, i.e. all data are brought into training at once, which achieves a better training effect;
further, the loss function used in the algorithm is the mean square error (MSE), also called quadratic loss or L2 loss, which is the mean of the squared differences between the original variables and the output values. The smaller the loss the better, and it eventually settles to a stable value. The formula is as follows:
\[ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i)^2 \]
where x_i is a data variable generated by the generator G and y_i is an original data variable;
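A minimal sketch of the mean square error between generated and original values follows, assuming the usual definition as the mean of squared differences; the sample values are hypothetical normalized data.

```python
def mse(generated, original):
    """Mean of squared differences between generated x_i and original y_i."""
    if len(generated) != len(original):
        raise ValueError("both sequences must have the same length")
    n = len(original)
    return sum((y - x) ** 2 for x, y in zip(generated, original)) / n


# Hypothetical normalized values.
original = [0.10, 0.40, 0.80, 1.00]
generated = [0.20, 0.40, 0.70, 0.90]
loss = mse(generated, original)
```

For these four pairs the squared differences sum to about 0.03, giving a loss of about 0.0075, well under the 0.2 threshold that the embodiment uses to decide whether the network needs re-tuning.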
further, the GAN data enhancement algorithm uses stochastic gradient descent to optimize the model. The optimized parameters are the weights ω and biases b of the linear layers of the generator G and the discriminator D. The principle is to take the partial derivative of the loss function so that the weights change in the direction in which the loss function decreases fastest, thereby optimizing the network. In the formula, x_i = ω·x_i + b is the output of a linear layer, and ω is updated for each sample as follows:
\[ \nabla_{\omega} L = \frac{\partial L(x_i, y_i)}{\partial \omega} \]
\[ \omega \leftarrow \omega - \alpha \nabla_{\omega} L \]
finally the updated ω is obtained, where L is the loss function, α is the learning rate, x_i is a data variable generated by the generator G, and y_i is an original data variable;
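To make the per-sample update concrete, the sketch below runs stochastic gradient descent on a single scalar linear unit with a squared-error loss. The unit x = ω·z + b, the learning rate, the epoch count and the toy data are all assumptions chosen for illustration; the patent applies the same idea to every linear layer of G and D.

```python
def sgd_step(w, b, z, y, alpha):
    """One stochastic gradient descent step for the loss (y - (w*z + b))**2."""
    x = w * z + b            # output of the linear unit
    error = x - y
    grad_w = 2.0 * error * z  # partial derivative of the loss w.r.t. w
    grad_b = 2.0 * error      # partial derivative of the loss w.r.t. b
    return w - alpha * grad_w, b - alpha * grad_b


def fit(samples, alpha=0.1, epochs=500):
    """Update w and b once per sample, repeating over the data set."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for z, y in samples:
            w, b = sgd_step(w, b, z, y, alpha)
    return w, b


# Hypothetical (input, target) pairs lying on y = 0.5*z + 0.2.
samples = [(0.0, 0.2), (0.5, 0.45), (1.0, 0.7)]
w, b = fit(samples)
```

Each step moves the parameters against the gradient, scaled by the learning rate α, so a larger α converges faster but can overshoot.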
Compared with the prior art, the invention has the following beneficial effects:
1. The method uses a GAN to enhance engineering data, so that research is no longer limited to obtaining data from the construction party; instead, one group of data can be expanded into multiple groups of data with the same distribution.
2. Compared with data expanded by interpolation or by the least squares method, the data generated by the GAN data enhancement model reflect the characteristics of the original distribution, generating them is simpler and more convenient, and the scale and form of the generated data can be chosen freely.
3. The GAN algorithm model is divided into a generator G and a discriminator D. The network structure is relatively simple and consists entirely of linear layers, which greatly reduces the number of network parameters and the training difficulty.
4. With adversarial training, the data produced after the noise data pass through the neural network of the generator G are closer to the original data variables, which effectively improves the quality of the enhancement.
Drawings
FIG. 1 is a general flow chart of the data enhancement algorithm;
FIG. 2 is a schematic diagram of the GAN data enhancement algorithm.
Detailed Description
The invention is described in further detail below with reference to the figures and a specific example:
Example:
An engineering data enhancement method based on a generative adversarial network, as shown in FIG. 1, includes the following steps:
Step 1: preprocessing the original data; the 'shutdown' data and the noise data are removed, so that the engineering data are smoother; noise in the data is reduced by noise reduction processing.
The acquired original data are preprocessed: abnormal data and noise data are removed, and the useless data recorded while the machine was stopped are screened out and deleted. The remaining data are engineering data recorded in the working state, with more distinct characteristics and a smoother profile.
Box plot processing is performed on the original data; abnormal data values are screened out through the box plot and deleted. Box plot outlier handling mainly screens out and removes data whose values lie beyond the upper and lower quantile limits;
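One common way to implement box plot screening is the interquartile range rule, sketched below. The whisker factor k = 1.5 and the inclusive quartile method are conventional choices assumed here, since the patent does not state them, and the torque values are hypothetical.

```python
import statistics


def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR], the usual box plot fences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical cutterhead torque readings with one spurious spike.
torque = [2050.0, 2100.0, 1980.0, 2075.0, 2020.0, 9000.0, 2110.0, 1995.0]
cleaned = remove_outliers(torque)
```

The spike at 9000.0 lies far above the upper fence and is removed, while the seven plausible readings are kept.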
The data remaining after abnormal values are screened out are then denoised. The purpose of noise reduction is to eliminate high-frequency noise in the original data and further improve data quality. Two noise reduction methods are in general use, the wavelet transform method and the moving average method. The moving average method smooths the original data with a sliding window, and data processed this way are better suited to data enhancement, so the moving average method is adopted for noise reduction.
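The sliding-window smoothing described above can be sketched as a simple moving average. The window length of 3 and the sample advance-speed values are assumptions for illustration; the patent does not specify a window size.

```python
def moving_average(values, window=3):
    """Smooth a series by averaging each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    smoothed = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new value, drop the oldest one.
        running += values[i] - values[i - window]
        smoothed.append(running / window)
    return smoothed


# Hypothetical advance-speed readings in mm/min.
speed = [40.0, 46.0, 43.0, 49.0, 40.0]
smooth = moving_average(speed, window=3)
```

This variant returns only the fully covered windows, so the output is shorter than the input by window - 1 points; a wider window removes more high-frequency noise at the cost of blurring genuine changes.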
The data that have undergone outlier cleaning and noise reduction still contain a large amount of shutdown data. Because the data to be studied are those recorded while the machine is working, the shutdown data must be removed. Shutdown data processing means the handling of the engineering parameter values recorded in a shutdown state, and is implemented as follows:
the 'null-push' (shutdown) data in the original data are screened using the following formula, which judges whether a single construction record is 'null-push' data: when any one of the four main working parameters of the TBM is zero, the current construction record is determined to be 'null-push' data.
P=f(RSP)f(T)f(F)f(V)
In the above formula, RSP is the cutterhead rotation speed, T is the cutterhead torque, F is the thrust, and V is the advance speed.
The function f is defined as follows:
\[ f(x) = \begin{cases} 0, & x = 0 \\ 1, & x \neq 0 \end{cases} \]
That is, the engineering data of interest consist mainly of the main operating parameters in the working state. In this case the shutdown data are removed from the TBM operation data: a record in which any one of the four values (cutterhead rotation speed, cutterhead torque, total thrust and advance speed) is 0 is shutdown data.
Step 2: normalizing the preprocessed data to avoid errors caused by differences in units and magnitudes in the original data.
The values in the original data differ in magnitude and unit, and feeding data of different ranges into the data enhancement algorithm would require the model to be modified and tuned each time. Normalized data, by contrast, eliminate the influence of units and bring the values into the range 0 to 1 without affecting the distribution characteristics of the data. The specific implementation is as follows:
let the original data to be enhanced be v_0 = {v_1, v_2, ..., v_n}, with maximum value v_max and minimum value v_min; every value in the set is converted according to the following formula:
\[ v_i' = \frac{v_i - v_{min}}{v_{max} - v_{min}} \]
the converted values are the normalized data, and the GAN data enhancement algorithm gives better results when fed normalized data;
the processed data set is called the target original data and is fed into the generative adversarial network for data enhancement.
Step 3: building and training the GAN model; a GAN data enhancement algorithm is established, and the preprocessed data are fed into the model for training.
In this embodiment a data set of size [1, 4000], i.e. 1 × 4000 = 4000 values, is selected from the target original data. The input noise size of the generator G is set to 4000 and the input size of the discriminator D is 4000; the target original data and the sample data generated by the generator are input into the discriminator in turn. The output size of the generator G is 4000, and the output size of the discriminator D is 1, a probability value that the input data are real. The generation principle of the GAN data enhancement model is a minimax game, whose objective function is as follows:
\[ \min_G \max_D V(D, G) = E_{x}[\log D(x)] + E_{z}[\log(1 - D(G(z)))] \]
The objective function of the discriminator D is:
\[ \max_D V(D, G) = E_{x}[\log D(x)] + E_{z}[\log(1 - D(G(z)))] \]
The objective function of the generator G is:
\[ \min_G V(D, G) = E_{z}[\log(1 - D(G(z)))] \]
where E(·) denotes the expectation, x is drawn from the original sample data, and z is the noise data input to the generator G.
Step 3.1: initializing the parameters of the two neural networks, the generator G and the discriminator D.
ω and b in the neural networks of the generator G and the discriminator D are initialized, that is, each of the three layers is given an initial weight and bias, yielding {(ω_1, b_1), (ω_2, b_2), (ω_3, b_3)}.
Step 3.2: n samples are extracted from the original data, and the generator generates n samples from the defined noise distribution. The generator G is fixed and the discriminator D is trained so that it can distinguish real data from generated data.
4000 samples are taken from the original data, and the hidden layer size of both the discriminator and the generator is set to 2000. That is, the input is a [1, 4000] matrix, the weight matrix after parameter initialization is [4000, 2000], and the generator G and the discriminator D have the same hidden layer size.
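The layer shapes described above can be illustrated with a small forward pass in plain Python. This is a sketch under stated assumptions: sizes 8 and 4 stand in for 4000 and 2000 to keep it fast, the middle layer is assumed to map hidden to hidden, the initial weight range is arbitrary, and every layer is followed by a sigmoid as the text describes.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases): an n_in x n_out weight matrix and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def forward_layer(x, layer):
    """Linear transform followed by a sigmoid activation."""
    weights, biases = layer
    out = []
    for j in range(len(biases)):
        s = biases[j] + sum(x[i] * weights[i][j] for i in range(len(x)))
        out.append(1.0 / (1.0 + math.exp(-s)))
    return out


def build_network(n_in, n_hidden, n_out, rng):
    """Three linear layers: input -> hidden, hidden -> hidden, hidden -> output."""
    return [
        init_layer(n_in, n_hidden, rng),
        init_layer(n_hidden, n_hidden, rng),
        init_layer(n_hidden, n_out, rng),
    ]


def forward(x, network):
    for layer in network:
        x = forward_layer(x, layer)
    return x


rng = random.Random(0)
SIZE, HIDDEN = 8, 4                      # stand-ins for 4000 and 2000
G = build_network(SIZE, HIDDEN, SIZE, rng)   # generator: noise -> data
D = build_network(SIZE, HIDDEN, 1, rng)      # discriminator: data -> probability
z = [rng.gauss(0.0, 1.0) for _ in range(SIZE)]
fake = forward(z, G)
score = forward(fake, D)[0]
```

The generator output has the same length as its noise input, and the discriminator reduces that vector to a single probability, mirroring the 4000-in, 4000-out and 4000-in, 1-out shapes of the embodiment.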
During each iteration:
1. Select 4000 sampling points {y_1, y_2, ..., y_4000} from the target original data set. The number of sampling points can be adjusted as needed; 4000 is used in this example.
2. Select a set of random variables from a random distribution (for example a Gaussian, i.e. normal, distribution) with the dimension set to 4000, i.e. {z_1, z_2, ..., z_4000}.
3. Feed z from step 2 into the neural network of the generator G to obtain a set of generated data with dimension 4000, i.e. {x_1, x_2, ..., x_4000}, where x_i = G(z_i).
4. Update the parameters ω_D of the discriminator D to maximize V_D. The larger V_D the better; according to the following formula, a larger V_D requires a smaller D(G(z)), i.e. a smaller score when a sample generated by the generator is input to the discriminator D, which is a classifier:
\[ V_D = \frac{1}{n} \sum_{i=1}^{n} \log D(y_i) + \frac{1}{n} \sum_{i=1}^{n} \log(1 - D(G(z_i))) \]
\[ \omega_D \leftarrow \omega_D + \alpha \nabla_{\omega_D} V_D \]
Steps 1 to 4 train and update the parameters of the discriminator D; in general the discriminator parameters need to be trained several times.
5. Update the generator parameters ω_G to minimize V_G:
\[ V_G = \frac{1}{n} \sum_{i=1}^{n} \log(1 - D(G(z_i))) \]
\[ \omega_G \leftarrow \omega_G - \alpha \nabla_{\omega_G} V_G \]
where n = 4000 is the number of sampling points and α is the learning rate.
Step 3.3: iterative training is carried out in a loop, with the discriminator D and the generator G competing against each other. In the ideal state the final discriminator D cannot tell whether data come from the original data or from the generator G; the discrimination probability is then 0.5 and training is complete.
In each iteration:
1. The generator G is fixed and only the parameters of the discriminator D are updated. The samples generated by the generator and the samples from the target original data are each fed into the discriminator D. The target of the discriminator D is to output a larger value when the input comes from the real data set; the larger the output, the more real the input is judged to be, and the smaller the output, the more fake.
2. The parameters of the discriminator D are fixed and the parameters of the generator G are updated. A group of noise vectors is input into the generator G to obtain a group of outputs, which are input into the discriminator D to obtain a value. Since the parameters of the discriminator D are fixed at this stage, the generator G must update its own parameters so that this value becomes as large as possible.
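The alternating scheme of the two phases can be sketched with a deliberately tiny model. This is not the patent's three-layer network: the generator is reduced to G(z) = a*z + b and the discriminator to D(x) = sigmoid(w*x + c) so that the gradients of the standard log objectives can be written by hand. All names, initial values, the learning rate and the synthetic "original" data are assumptions for illustration.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def train(real, steps=200, alpha=0.05, d_steps=1, seed=0):
    """Alternate discriminator and generator updates on a scalar toy GAN."""
    rng = random.Random(seed)
    g = {"a": 0.1, "b": 0.0}   # generator G(z) = a*z + b
    d = {"w": 0.0, "c": 0.0}   # discriminator D(x) = sigmoid(w*x + c)
    n = len(real)
    for _ in range(steps):
        # Phase 1: generator fixed, gradient ascent on
        # V_D = mean(log D(y)) + mean(log(1 - D(G(z)))).
        for _ in range(d_steps):
            fake = [g["a"] * rng.gauss(0.0, 1.0) + g["b"] for _ in range(n)]
            grad_w = grad_c = 0.0
            for y, x in zip(real, fake):
                d_real = sigmoid(d["w"] * y + d["c"])
                d_fake = sigmoid(d["w"] * x + d["c"])
                grad_w += ((1.0 - d_real) * y - d_fake * x) / n
                grad_c += ((1.0 - d_real) - d_fake) / n
            d["w"] += alpha * grad_w
            d["c"] += alpha * grad_c
        # Phase 2: discriminator fixed, gradient descent on
        # V_G = mean(log(1 - D(G(z)))), which raises D's score on fakes.
        grad_a = grad_b = 0.0
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)
            x = g["a"] * z + g["b"]
            dv_dx = -sigmoid(d["w"] * x + d["c"]) * d["w"]
            grad_a += dv_dx * z / n
            grad_b += dv_dx / n
        g["a"] -= alpha * grad_a
        g["b"] -= alpha * grad_b
    return g, d


# Hypothetical normalized "original" data clustered around 0.7.
data_rng = random.Random(1)
real = [0.7 + 0.05 * data_rng.gauss(0.0, 1.0) for _ in range(64)]
g_params, d_params = train(real)
```

The `d_steps` argument reflects the remark that the discriminator is generally trained several times per iteration; raising it gives the generator a sharper signal at the cost of more computation.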
Step 4: testing the model; several groups of engineering data are fed into the model, and the results are compared with the original data to confirm their accuracy, completing the enhancement of the engineering data.
After the training in step 3, the GAN has been trained and the algorithm is able to generate engineering data. The target original data to be enhanced are fed into the algorithm, and the output is the enhanced data. The target original data and the data generated by the generator G are substituted into the MSE function; if the resulting value is greater than 0.2, the result is not good enough, and the GAN must be adjusted again to reach the best quality.
Step 5: optimizing the model; the model is optimized by changing the parameters and the form of the loss function, so that more accurate results can be generated.
The model is optimized by changing the learning rate, the hidden layer size and the form of the activation function. Because the optimal parameters differ for each group of data, the algorithm as given suits the enhancement of most engineering data, while for a small portion of the data the parameters in the GAN need to be adjusted to achieve the best effect.
In this embodiment, the distribution of the enhanced engineering data almost matches that of the target original data. Substituting the two sets of data into the mean square error function also gives a result below 0.2, which indicates that the output of this algorithm is of sufficient quality for research and analysis, effectively resolving the dilemma in which a shortage of data prevents data analysis from continuing.
The method preprocesses the engineering data. The preprocessing is not complex and mainly comprises cleaning abnormal data, denoising the data and cleaning shutdown data; the data finally obtained after normalization are the target original data, which can be fed into the GAN. The method adopts a GAN data enhancement algorithm divided into a generator G and a discriminator D. The network structure is simple, consisting mainly of three linear layers, each followed by an activation function, which greatly reduces the number of network parameters and the training difficulty. Adversarial training is used, so that the noise data, after being reconstructed by the generator G, approximate the target original data more closely, effectively improving the quality of engineering data enhancement. Different data distributions require fine adjustment of the GAN model, but the disclosed algorithm is applicable to the enhancement of most engineering data, with good results.
Obviously, the described embodiment is only one possible embodiment of the invention, not all embodiments. All other embodiments that a person skilled in the art can obtain from the embodiments of the invention without inventive effort fall within the protection scope of the invention.

Claims (10)

1. An engineering data enhancement algorithm based on a generative adversarial network, characterized by comprising the following steps:
step 1: preprocessing the original data; abnormal data and noise data are removed and shutdown data are screened out, so that the engineering data are smoother; noise in the data is reduced by noise reduction processing;
step 2: normalizing the preprocessed data to avoid errors caused by differences in units and magnitudes in the original data;
step 3: building and training a GAN model; a GAN-based data enhancement algorithm is established, and the preprocessed data are fed into the model for training;
step 4: testing the model; multiple groups of engineering data are fed into the model for training, and the results are compared with the original data and analyzed, so as to obtain accurate results and complete the data enhancement of the engineering data;
step 5: optimizing the model; the model is optimized by changing the parameters and the form of the loss function, so that more accurate results can be generated.
2. The engineering data enhancement algorithm based on a generative adversarial network according to claim 1, wherein screening the "shutdown" data in step 1 means removing the "shutdown" data from the original data;
whether a single construction record is "shutdown" data is judged by the following formula, that is, the current construction record is determined to be "shutdown" data when any one of the four main working parameters of the TBM is zero:
P=f(RSP)f(T)f(F)f(V)
In the above formula, RSP is the cutterhead rotation speed, T is the cutterhead torque, F is the thrust, and V is the advance speed; the function f is defined as follows:
Figure FDA0003066390930000011
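The figure defining f is not reproduced in this text. The sketch below therefore assumes that f is an indicator function (0 when its argument is zero, 1 otherwise), which is consistent with the stated rule that a record is "shutdown" data when any of the four parameters is zero. The dictionary layout of a record and the helper names `is_shutdown` and `remove_shutdown` are hypothetical.

```python
def f(value):
    """Assumed indicator: 0 when the parameter is zero, 1 otherwise."""
    return 0 if value == 0 else 1


def is_shutdown(record):
    """Return True when P = f(RSP) f(T) f(F) f(V) equals zero.

    `record` is a dict with keys RSP, T, F, V (hypothetical layout).
    """
    p = f(record["RSP"]) * f(record["T"]) * f(record["F"]) * f(record["V"])
    return p == 0


def remove_shutdown(records):
    """Keep only the records in which all four parameters are non-zero."""
    return [r for r in records if not is_shutdown(r)]


# Made-up example records, for illustration only.
records = [
    {"RSP": 1.2, "T": 850.0, "F": 9000.0, "V": 35.0},
    {"RSP": 0.0, "T": 0.0, "F": 1200.0, "V": 0.0},
    {"RSP": 1.1, "T": 790.0, "F": 8800.0, "V": 0.0},
]
kept = remove_shutdown(records)
```

Only the first record survives here, because each of the other two has at least one zero parameter.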
3. The engineering data enhancement algorithm based on a generative adversarial network according to claim 1, wherein normalizing the data in step 2 comprises:
letting the data set of a parameter be {V_0} and V_X be a variable in the data set, the result after normalization is V:
Figure FDA0003066390930000012
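The normalization formula itself is in a figure that is not reproduced in this text. A common choice that maps each parameter into the range 0 to 1 is min-max scaling; the sketch below assumes that form, which may differ from the formula in the figure. The function name `min_max_normalize` and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] with (v - min) / (max - min).

    This is an assumed form of the normalization step, not taken
    from the patent figure.
    """
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant column carries no scale information; map it to 0.
        return [0.0 for _ in values]
    span = v_max - v_min
    return [(v - v_min) / span for v in values]


# Made-up torque readings, for illustration only.
torque = [800.0, 850.0, 900.0, 1000.0]
normalized = min_max_normalize(torque)
```

Applying the same function to each parameter separately removes the effect of differing units and magnitudes that step 2 is meant to avoid.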
4. The engineering data enhancement algorithm based on a generative adversarial network as claimed in claim 1, wherein the GAN model in step 3 comprises two parts, a generator model G and a discriminator model D; they are composed of three-layer neural networks whose input layer, hidden layer, and output layer are all linear layers, and the enhancement of original data satisfying all conditions is realized through linear transformations and the nonlinear transformations of the activation functions;
the generator G takes a group of random noise as input and finally generates a group of engineering data that conforms to the distribution of the original data;
the discriminator D is a binary classifier whose input comprises the original data and the data generated by the generator G; it outputs a scalar, and the larger the value, the more consistent the data generated by the generator are with the original data.
5. The engineering data enhancement algorithm based on a generative adversarial network as claimed in claim 4, wherein the principle of the GAN is a minimax game between the generator G and the discriminator D, and its objective function is:
Figure FDA0003066390930000021
where E() is the loss function, x is the original sample data, and z is the noise data input to the generator G.
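The figure with the objective function is not reproduced in this text. For reference, the minimax objective of the original GAN formulation is usually written as below; whether the patent figure uses exactly this form is an assumption. In this standard form, E denotes an expectation over the indicated distribution, p_data is the distribution of the original data, and p_z is the noise distribution (the last two symbols do not appear in the text above).

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator D maximizes V by scoring original data high and generated data low, while the generator G minimizes V by making D(G(z)) large.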
6. The engineering data enhancement algorithm based on a generative adversarial network as claimed in claim 4, wherein an activation function is added after each linear layer in the neural networks of the generator G and the discriminator D; the algorithm uses the sigmoid activation function, whose output lies in the range 0 to 1, and the formula is as follows:
Figure FDA0003066390930000022
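The sigmoid function is 1 / (1 + e^(-x)). The sketch below shows it together with a forward pass through three linear layers, each followed by a sigmoid, as the claims describe. The layer sizes (8, 16, 16, 4), the weight initialization range, and the helper names are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def linear(inputs, weights, biases):
    """One linear layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(inputs, layers):
    """Apply each linear layer followed by a sigmoid activation."""
    out = inputs
    for weights, biases in layers:
        out = [sigmoid(v) for v in linear(out, weights, biases)]
    return out


rng = random.Random(0)
# Hypothetical sizes: 8-dim noise -> 16 -> 16 -> 4 engineering parameters.
generator = [init_layer(8, 16, rng), init_layer(16, 16, rng),
             init_layer(16, 4, rng)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]
sample = forward(noise, generator)
```

Because the last layer also ends in a sigmoid, every generated value lies between 0 and 1, which matches data that have been normalized to that range.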
7. The engineering data enhancement algorithm based on a generative adversarial network according to claim 1, wherein step 3 comprises the following steps:
step 3.1: initializing the parameters of the two neural networks, the generator G and the discriminator D;
step 3.2: drawing n samples from the original data and generating n samples with the generator from the defined noise distribution; fixing the generator G and training the discriminator D so that it can distinguish real data from generated data;
step 3.3: iterating the training in a loop, in which the discriminator D and the generator G compete against each other; in the ideal state, the final discriminator D cannot tell whether data come from the original data or from the generator G, the discrimination probability is then 0.5, and training is complete.
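The alternating scheme in steps 3.1 to 3.3 can be sketched on a deliberately tiny one-dimensional problem. This is not the three-layer network of the claims: to keep the gradients short enough to write by hand, the generator is assumed to be G(z) = a·z + b and the discriminator D(x) = sigmoid(w·x + c). The function name `train_toy_gan`, the hyperparameters, and the sample data are all hypothetical.

```python
import math
import random


def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(real_data, steps=200, n=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    # Step 3.1: initialize the parameters of G and D.
    g = {"a": rng.uniform(-1.0, 1.0), "b": 0.0}  # G(z) = a*z + b
    d = {"w": rng.uniform(-1.0, 1.0), "c": 0.0}  # D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        # Step 3.2: draw n real and n generated samples, keep G fixed,
        # and take one gradient step on the discriminator loss.
        real = [rng.choice(real_data) for _ in range(n)]
        fake = [g["a"] * rng.gauss(0.0, 1.0) + g["b"] for _ in range(n)]
        grad_w = grad_c = 0.0
        for x in real:  # loss term -log D(x)
            p = sigmoid(d["w"] * x + d["c"])
            grad_w += -(1.0 - p) * x
            grad_c += -(1.0 - p)
        for x in fake:  # loss term -log(1 - D(G(z)))
            p = sigmoid(d["w"] * x + d["c"])
            grad_w += p * x
            grad_c += p
        d["w"] -= lr * grad_w / n
        d["c"] -= lr * grad_c / n
        # Step 3.3: keep D fixed and take one gradient step on the
        # generator loss log(1 - D(G(z))), then repeat the loop.
        grad_a = grad_b = 0.0
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)
            p = sigmoid(d["w"] * (g["a"] * z + g["b"]) + d["c"])
            grad_out = -p * d["w"]  # derivative of the loss w.r.t. G(z)
            grad_a += grad_out * z
            grad_b += grad_out
        g["a"] -= lr * grad_a / n
        g["b"] -= lr * grad_b / n
    return g, d


# Made-up normalized data clustered around 0.5, for illustration only.
real_data = [0.45, 0.5, 0.55, 0.5, 0.48, 0.52]
g, d = train_toy_gan(real_data)
score = sigmoid(d["w"] * 0.5 + d["c"])  # D's output for a typical real value
```

In the ideal state described in step 3.3, `score` would approach 0.5; a short run on a toy problem like this one is not guaranteed to reach it.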
8. The engineering data enhancement algorithm based on a generative adversarial network as claimed in claim 1, wherein in step 5 the optimization process is the process of modifying the necessary parameters, including but not limited to the learning rate, the number of training steps, and the form of the loss function.
9. The engineering data enhancement algorithm based on a generative adversarial network as claimed in claim 1, wherein the loss function used in step 5 is the mean square error function, also called quadratic loss or L2 loss; its specific form is the sum of the squared distances between the original variables and the output values; the smaller the loss function the better, and the final result should reach a stable state; the formula is as follows:
Figure FDA0003066390930000023
where x_i is a data variable generated by the generator G, and y_i is an original data variable.
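The figure with the loss formula is not reproduced in this text. The sketch below assumes the usual mean square error, that is, the squared differences between generated values x_i and original values y_i averaged over the n samples; the 1/n factor is an assumption, since the claim text only mentions a sum of squared distances. The function name and sample values are hypothetical.

```python
def mean_squared_error(generated, original):
    """Average of (y_i - x_i)^2 over paired generated/original values."""
    if len(generated) != len(original):
        raise ValueError("sequences must have the same length")
    if not generated:
        raise ValueError("sequences must not be empty")
    total = sum((y - x) ** 2 for x, y in zip(generated, original))
    return total / len(original)


# Made-up normalized values, for illustration only.
generated = [0.5, 0.25, 1.0, 0.0]
original = [0.5, 0.75, 0.5, 0.0]
loss = mean_squared_error(generated, original)
```

With these made-up values the loss is 0.125, which would pass the check against the 0.2 level reported in the embodiment.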
10. The engineering data enhancement algorithm based on a generative adversarial network as claimed in claim 1, wherein the model in step 5 is optimized by the stochastic gradient descent method; the optimized parameters include the weights ω and biases b of the linear layers of the neural networks of the generator G and the discriminator D; the principle is to take the partial derivatives of the loss function so that the weights change in the direction in which the loss function decreases fastest, thereby optimizing the network; x_i in the above formula is x_i = ω·x_i + b, where ω is updated for each sample, and the specific formulas are as follows:
Figure FDA0003066390930000031
Figure FDA0003066390930000032
finally, the updated ω is obtained, where α is the learning rate, x_i is a data variable generated by the generator G, and y_i is an original data variable.
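The two update formulas are in figures that are not reproduced in this text. The sketch below assumes the standard per-sample stochastic gradient descent rule for a single linear unit with a squared-error loss: the prediction is ω·x + b, and each parameter moves against its partial derivative scaled by the learning rate α. Constant factors may differ from the patent figures, and the function name `sgd_step` and the sample data are hypothetical.

```python
def sgd_step(omega, b, x, y, alpha):
    """One per-sample update for pred = omega * x + b with loss (pred - y)^2."""
    pred = omega * x + b
    error = pred - y
    grad_omega = 2.0 * error * x  # partial derivative w.r.t. omega
    grad_b = 2.0 * error          # partial derivative w.r.t. b
    return omega - alpha * grad_omega, b - alpha * grad_b


# Made-up samples that lie exactly on y = 0.5 * x + 0.5.
samples = [(0.0, 0.5), (1.0, 1.0), (0.5, 0.75)]
omega, b = 0.0, 0.0
for _ in range(2000):
    for x, y in samples:
        # The parameters are updated after every single sample.
        omega, b = sgd_step(omega, b, x, y, alpha=0.1)
```

On this noise-free toy data the parameters settle near ω = 0.5 and b = 0.5. In the full model the same rule would be applied to every weight and bias of the linear layers in G and D.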
CN202110528930.4A 2021-05-14 2021-05-14 Engineering data enhancement algorithm based on generation of countermeasure network Pending CN113360485A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110528930.4A CN113360485A (en) 2021-05-14 2021-05-14 Engineering data enhancement algorithm based on generation of countermeasure network

Publications (1)

Publication Number Publication Date
CN113360485A true CN113360485A (en) 2021-09-07

Family

ID=77526492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110528930.4A Pending CN113360485A (en) 2021-05-14 2021-05-14 Engineering data enhancement algorithm based on generation of countermeasure network

Country Status (1)

Country Link
CN (1) CN113360485A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114330924A (en) * 2022-01-10 2022-04-12 中国矿业大学 Complex product change strength prediction method based on generating type countermeasure network
CN114330924B (en) * 2022-01-10 2023-04-18 中国矿业大学 Complex product change strength prediction method based on generating type countermeasure network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210907