CN110929786B - Data augmentation method and electronic equipment - Google Patents

Data augmentation method and electronic equipment

Info

Publication number
CN110929786B
CN110929786B (application CN201911157799.4A)
Authority
CN
China
Prior art keywords
data
augmentation
working condition
sample
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911157799.4A
Other languages
Chinese (zh)
Other versions
CN110929786A (en
Inventor
蔺思宇 (Lin Siyu)
杨晨旺 (Yang Chenwang)
马君 (Ma Jun)
刘涛 (Liu Tao)
李素洁 (Li Sujie)
王伟 (Wang Wei)
史超 (Shi Chao)
周景源 (Zhou Jingyuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meifang Science And Technology Beijing Co ltd
Original Assignee
Meifang Science And Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Meifang Science And Technology Beijing Co ltd filed Critical Meifang Science And Technology Beijing Co ltd
Priority to CN201911157799.4A priority Critical patent/CN110929786B/en
Publication of CN110929786A publication Critical patent/CN110929786A/en
Application granted granted Critical
Publication of CN110929786B publication Critical patent/CN110929786B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Eyeglasses (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the invention provide a data augmentation method and an electronic device. The method comprises: acquiring working condition label data, and generating an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data; and inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data. The trained working condition data augmentation model is obtained by training on real-label normal data and false-label initialization sample augmentation data. Starting from the device's large-sample data, a data augmentation model that can generate data similar to the large-sample label data is constructed, and the model is then adjusted with the small-sample working condition label data, so that the small-sample working condition label data can be augmented and the distribution of the small-sample augmented data more closely approaches the breadth of the large-sample data distribution.

Description

Data augmentation method and electronic equipment
Technical Field
The present invention relates to the field of industrial information technologies, and in particular, to a data augmentation method and an electronic device.
Background
As the level of modern industrial automation rises, the scale of industrial systems keeps growing and the coordination among their components becomes increasingly complex. Once some component of an industrial system fails, the whole system cannot operate normally, causing large shutdown losses.
Industrial mechanical equipment is long-running equipment: its service life is generally specified as no less than 20 years, and it typically runs continuously for no less than three years. Provided the equipment leaves the factory in qualified condition, working (fault) conditions rarely occur during real-world operation, so data carrying working condition labels are hard to collect in practice. Working condition diagnosis models for much industrial mechanical equipment are therefore trained with only a small sample of working condition data; because normal data dominate, the model tends to classify most inputs as normal, which raises its missed-detection rate.
In addition, the prior art lacks a method for augmenting small-sample, one-dimensional signal data of industrial mechanical equipment; only when a sufficient amount of data is available can the accuracy of a working condition diagnosis model for industrial mechanical equipment be effectively improved.
How to augment industrial machinery signal data has therefore become a pressing issue in the industry.
Disclosure of Invention
Embodiments of the invention provide a data augmentation method and an electronic device to solve, or at least partially solve, the technical problems described in the background section above.
In a first aspect, an embodiment of the present invention provides a data augmentation method, including:
acquiring working condition label data, and generating an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data;
inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data;
wherein the trained working condition data augmentation model is obtained by training on real-label normal data and false-label initialization sample augmentation data.
More specifically, before the step of acquiring the working condition label data, the method further includes:
splitting industrial raw data into real-label non-working-condition data and working condition label data;
and cleaning outliers from the real-label non-working-condition data to obtain real-label normal data.
More specifically, the trained working condition data augmentation model comprises a trained generator and a trained discriminator.
More specifically, before the step of inputting the working condition label data and the initialization data of the augmented data into the trained working condition data augmentation model, the method further comprises:
acquiring real-label normal data, and generating an arbitrary amount of random data with the same size as the real-label normal data as initialization sample augmentation data;
inputting the initialization sample augmentation data into the generator of a data augmentation model to obtain false-label sample augmentation data, taking the false-label sample augmentation data as new initialization sample augmentation data and feeding it into the generator again, and repeating this training until the loss function of the generator converges stably, thereby obtaining a trained generator;
mixing the false-label sample augmentation data with the real-label normal data and inputting the mixture into the discriminator of the data augmentation model, and obtaining a trained discriminator when the loss function of the discriminator converges stably;
and obtaining a trained data augmentation model from the trained generator and the trained discriminator.
More specifically, the generator in the data augmentation model is formed by combining a convolutional neural network encoder with a convolutional neural network decoder.
More specifically, the loss function of the generator is obtained as follows:
the real-label normal data and the initialization sample augmentation data are each processed by the convolutional neural network encoder to extract a real-label normal data feature vector and an initial augmentation feature vector, and a first loss function is obtained from the mean square error between the two feature vectors;
the initial augmentation feature vector is decoded by the convolutional neural network decoder to obtain sample augmentation false data, and a second loss function is obtained from the mean square error between the sample augmentation false data and the real-label normal data;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation false data and the fast Fourier transform of the real-label normal data;
and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
In a second aspect, an embodiment of the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the data augmentation method of the first aspect when the program is executed.
In a third aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data augmentation method of the first aspect.
According to the data augmentation method and the electronic device provided by the embodiments of the invention, a data augmentation model capable of producing data similar to the device's large-sample label data is first constructed from the device's large-sample data, and the model is then adjusted with the small-sample working condition label data. The small-sample working condition label data can thus be augmented, the distribution of the small-sample augmented data more closely approaches the breadth of the large-sample data distribution, the data augmentation model expands the distribution breadth of the small sample in a reasonable way, and the credibility of the augmented data at the edge of the small-sample distribution is guaranteed.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a data augmentation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data augmentation apparatus according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a flow chart of a data augmentation method according to an embodiment of the present invention, as shown in FIG. 1, comprising:
step S1, acquiring working condition label data, and manufacturing any number of random data with the same size as the working condition label data as initialization data of the augmentation data;
s2, inputting the working condition label data and the initialization data of the augmentation data into a trained working condition data augmentation model to obtain the augmentation working condition data;
the trained data augmentation model is obtained through initialization sample augmentation data of real label normal data and false labels and training.
The working condition label data described in the embodiments of the invention refer to fault information recorded for industrial mechanical equipment; this fault information carries a working condition label.
The initialization data of the augmented data described in the embodiments of the invention are randomly generated data whose size is consistent with that of the working condition label data; they serve as the starting point of the augmented data.
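For illustration only (not part of the patent disclosure), the following is a minimal Python sketch of producing such initialization data, assuming the working condition label data are stored as a NumPy array of shape (num_samples, signal_length); using a standard-normal distribution is an assumption, since the text only requires random data of matching size.

```python
import numpy as np

def make_init_augmentation_data(condition_label_data: np.ndarray,
                                num_augmented: int, seed: int = 0) -> np.ndarray:
    """Generate `num_augmented` random signals whose size matches a working
    condition sample; they serve as initialization data for the augmented data."""
    rng = np.random.default_rng(seed)
    signal_length = condition_label_data.shape[1]
    return rng.standard_normal((num_augmented, signal_length)).astype(np.float32)
```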
The trained working condition data augmentation model described in the embodiments of the invention is applied in the field of modern industrial machinery. Addressing the problem that automatic learning models for industrial machinery have few working condition samples, working condition label data can be fed into the trained model to augment them and obtain augmented working condition data.
The augmented working condition data described in the embodiments of the invention are a set of expanded working condition data; once obtained, they can effectively improve the accuracy of an automatic learning model for industrial mechanical equipment.
The trained working condition data augmentation model described in the embodiments of the invention is obtained by training on real-label normal data and false-label initialization sample augmentation data.
The real-label normal data described herein are large-volume data carrying a real data label; a real data label indicates that the data were obtained from real measurements rather than generated randomly. The false labels described in the embodiments of the invention mark data that are randomly generated by the generator of the data augmentation model.
First, a model capable of augmenting the large data sample is trained from the real-label normal data and from random initialization samples whose size matches that normal data; this model is then taken as the base model and further trained with working condition data, yielding a trained working condition data augmentation model that can expand the working condition data.
According to the embodiments of the invention, a data augmentation model that can generate data similar to the device's large-sample label data is first constructed from the device's large-sample data, and the model is then adjusted with the small-sample working condition label data. The small-sample working condition label data can thus be augmented, the data augmentation model expands the distribution breadth of the small sample in a reasonable way, and the credibility of the augmented data at the edge of the small-sample distribution is guaranteed.
On the basis of the foregoing embodiment, before the step of acquiring the working condition label data, the method further includes:
splitting industrial raw data into real-label non-working-condition data and working condition label data;
and cleaning outliers from the real-label non-working-condition data to obtain real-label normal data.
Specifically, the industrial raw data described in the embodiments of the present invention is raw data directly extracted from an industrial machinery system.
The real-label non-working-condition data described in the embodiments of the invention are the real large-sample data of an industrial mechanical equipment system, while the working condition label data are the small-sample real working condition data of that system.
In the embodiments of the invention, cleaning the outliers from the real-label non-working-condition data improves data reliability, and distinguishing the real-label non-working-condition data from the working condition label data allows model training to be carried out effectively on the working condition data.
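As a hedged illustration of this pre-processing step, the sketch below assumes the raw data arrive as a signal array plus a 0/1 label array (1 = working condition record) and uses a z-score rule on per-sample signal energy for the outlier cleaning; the patent itself does not specify the cleaning criterion.

```python
import numpy as np

def split_and_clean(raw_signals: np.ndarray, labels: np.ndarray, z_thresh: float = 3.0):
    """Split industrial raw data into working condition label data and real-label
    non-working-condition data, then drop outlier rows from the latter to obtain
    real-label normal data."""
    condition_label_data = raw_signals[labels == 1]
    non_condition_data = raw_signals[labels == 0]
    # Outlier rule assumed: per-sample z-score of the signal energy.
    energy = (non_condition_data ** 2).mean(axis=1)
    z = (energy - energy.mean()) / (energy.std() + 1e-8)
    real_label_normal_data = non_condition_data[np.abs(z) < z_thresh]
    return condition_label_data, real_label_normal_data
```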
On the basis of the above embodiment, the trained working condition data augmentation model comprises a trained generator and a trained discriminator.
Before the step of inputting the working condition label data and the initialization data of the augmented data into the trained working condition data augmentation model, the method further comprises:
acquiring real-label normal data, and generating an arbitrary amount of random data with the same size as the real-label normal data as initialization sample augmentation data;
inputting the initialization sample augmentation data into the generator of a data augmentation model to obtain false-label sample augmentation data, taking the false-label sample augmentation data as new initialization sample augmentation data and feeding it into the generator again, and repeating this training until the loss function of the generator converges stably, thereby obtaining a trained generator;
mixing the false-label sample augmentation data with the real-label normal data and inputting the mixture into the discriminator of the data augmentation model, and obtaining a trained discriminator when the loss function of the discriminator converges stably;
and obtaining a trained data augmentation model from the trained generator and the trained discriminator.
Specifically, the initialization sample augmentation data consist of an arbitrary number of randomly generated samples whose data format is consistent with the size of the real-label normal data.
The initialization sample augmentation data and the real-label normal data are each passed through a convolutional neural network encoder to obtain an initial augmentation feature vector and a real-label normal data feature vector, and the mean square error between the two feature vectors gives a first loss function;
the initial augmentation feature vector is decoded by a convolutional neural network decoder to obtain sample augmentation false data, and the mean square error between the sample augmentation false data and the real-label normal data gives a second loss function;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation false data and the fast Fourier transform of the real-label normal data;
the convolutional neural network encoder and decoder together form the generator of the data augmentation model, and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
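The following PyTorch sketch illustrates one possible generator (a 1-D convolutional encoder-decoder) and the three-part loss just described. The layer sizes, the use of the FFT magnitude for the cosine distance, the loss weights w1, w2, w3, and the assumption that the batch of real data and the batch of initialization data share the same shape (batch, 1, length) with length divisible by 4, are all illustrative choices rather than values given in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
        )

    def forward(self, x):                    # x: (batch, 1, length)
        return self.net(x)                   # features: (batch, 32, length // 4)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose1d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, z):
        return self.net(z)                   # reconstructed signal: (batch, 1, length)

def generator_loss(enc, dec, real_normal, init_aug, w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the three generator loss terms described in the text."""
    real_feat = enc(real_normal)             # real-label normal data feature vector
    aug_feat = enc(init_aug)                 # initial augmentation feature vector
    loss1 = F.mse_loss(aug_feat, real_feat)  # first loss: feature-space MSE
    fake = dec(aug_feat)                     # sample augmentation false data
    loss2 = F.mse_loss(fake, real_normal)    # second loss: signal-space MSE
    # third loss: cosine distance between the FFTs (magnitude spectra assumed)
    fft_fake = torch.fft.rfft(fake, dim=-1).abs().flatten(1)
    fft_real = torch.fft.rfft(real_normal, dim=-1).abs().flatten(1)
    loss3 = (1.0 - F.cosine_similarity(fft_fake, fft_real, dim=1)).mean()
    return w1 * loss1 + w2 * loss2 + w3 * loss3, fake
```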
In the initial training stage, the initialization sample augmentation data are fed into the generator of the data augmentation model to obtain false-label sample augmentation data. If the loss function of the generator has not yet converged stably, training continues in a loop: the false-label data output by the generator replace the initialization sample augmentation data as the next input, and this cycle repeats until the generator's loss function converges stably, at which point training stops and the trained generator is obtained.
In each round of generator training, the generator's output is reused as its input in the next round and is also fed into the discriminator, whose input additionally includes the real-label normal data. The discriminator is trained repeatedly alongside the cyclic training of the generator: the two steps alternate, with one discriminator update after each generator update, until the loss function of the discriminator converges stably, finally yielding a data augmentation model able to augment the large-sample data.
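A sketch of this alternating loop is given below. It reuses the Encoder, Decoder and generator_loss from the previous sketch and assumes a Discriminator module (sketched after the loss-function discussion further below); the Adam optimizer, learning rate and fixed epoch count stand in for the "train until stable convergence" criterion and are assumptions, not details from the patent.

```python
import torch

def train_alternating(enc, dec, disc, target_data, init_aug, epochs=200, lr=1e-3):
    """One generator update, then one discriminator update, per round; the
    generator's latest output becomes the next round's initialization data."""
    opt_g = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=lr)
    opt_d = torch.optim.Adam(disc.parameters(), lr=lr)
    bce = torch.nn.BCEWithLogitsLoss()        # cross entropy over real/false labels
    aug = init_aug
    for _ in range(epochs):
        # generator step against the target data (real-label normal data here)
        opt_g.zero_grad()
        g_loss, fake = generator_loss(enc, dec, target_data, aug)
        g_loss.backward()
        opt_g.step()
        aug = fake.detach()                   # false-label output re-used as next input
        # discriminator step on the mixture of real and false-label data
        opt_d.zero_grad()
        inputs = torch.cat([target_data, aug], dim=0)
        labels = torch.cat([torch.ones(len(target_data), 1), torch.zeros(len(aug), 1)])
        d_loss = bce(disc(inputs), labels)
        d_loss.backward()
        opt_d.step()
        # in practice: stop once both g_loss and d_loss have converged stably
    return enc, dec, disc
```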
In this way, the training of a data augmentation model built on a large amount of sample data can conveniently guide the training of the small-sample data augmentation model; the working condition data distribution produced by the working condition data augmentation model, trained with this data augmentation model as its base model, can be expanded more reasonably, so that its distribution space better approximates the breadth of the large-sample data distribution.
On the basis of the above embodiment, after the step of obtaining a trained data augmentation model from the trained generator and the trained discriminator, the method further comprises:
acquiring sample working condition label data, and generating an arbitrary amount of initialization sample working condition augmentation data with the same size as the sample working condition label data;
and taking the trained data augmentation model as a base model, continuing to train the base model on the initialization sample working condition augmentation data and the sample working condition label data, and obtaining the trained working condition data augmentation model when the loss function of the base model converges stably.
Specifically, the data augmentation model is used as the base model, the real-label normal data are replaced with working condition label data, and the base model is trained further; the trained working condition data augmentation model is obtained once the loss function of the base model has converged stably.
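Continuing the sketches above, the two-stage procedure might then look as follows; the tensors real_normal, init_sample_aug, condition_label_data and init_condition_aug are assumed to have been prepared as in the earlier sketches, and the smaller epoch count and lower learning rate for fine-tuning are assumptions.

```python
# Stage 1: train the base model on the large-sample real-label normal data.
enc, dec, disc = Encoder(), Decoder(), Discriminator()
train_alternating(enc, dec, disc, real_normal, init_sample_aug, epochs=200, lr=1e-3)

# Stage 2: replace the normal data with the small-sample working condition label
# data and continue training the same modules (the fine-tuning step).
train_alternating(enc, dec, disc, condition_label_data, init_condition_aug,
                  epochs=100, lr=1e-4)

# Augmented working condition data are then read off the trained generator.
augmented_condition_data = dec(enc(init_condition_aug)).detach()
```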
According to the embodiments of the invention, a data augmentation model that can generate data similar to the device's large-sample label data is constructed from the device's large-sample data, and the model is then adjusted with the small-sample working condition label data. The small-sample working condition label data can thus be augmented, the distribution of the small-sample augmented data more closely approaches the breadth of the large-sample data distribution, the distribution breadth of the small sample is expanded in a reasonable way, and the credibility of the augmented data at the edge of the small-sample distribution is guaranteed.
On the basis of the above embodiment, the loss function of the generator is obtained as follows:
the real-label normal data and the initialization sample augmentation data are each processed by the convolutional neural network encoder to extract a real-label normal data feature vector and an initial augmentation feature vector, and a first loss function is obtained from the mean square error between the two feature vectors;
the initial augmentation feature vector is decoded by the convolutional neural network decoder to obtain sample augmentation false data, and a second loss function is obtained from the mean square error between the sample augmentation false data and the real-label normal data;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation false data and the fast Fourier transform of the real-label normal data;
and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
A convolutional neural network is used as the discriminator, the real label or false label serves as the target label, and cross entropy is used as the loss function of the discriminator.
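A matching discriminator sketch, completing the training loop above, could be a small 1-D convolutional network whose single logit is scored with binary cross entropy against the real/false label; its depth and channel widths are assumptions rather than details from the patent.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """1-D CNN that scores a signal as real (1) or generated/false (0)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
            nn.Conv1d(16, 32, 5, stride=2, padding=2), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(32, 1)    # one logit; BCEWithLogitsLoss supplies the cross entropy

    def forward(self, x):                     # x: (batch, 1, length)
        h = self.features(x).squeeze(-1)      # (batch, 32)
        return self.classifier(h)             # (batch, 1) logits
```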
After the augmented working condition data are obtained, the method further comprises:
inputting the augmented working condition data into the automatic learning algorithm of the industrial mechanical equipment, which effectively improves the accuracy of that algorithm; combined with the automatic learning algorithm, this forms an improved automatic learning procedure.
The embodiment of the invention does not start from the small-sample working condition data directly, but from the large sample of the same equipment, whose distribution is broader, and then adjusts the model on the small sample, so that the parameters of the augmentation model can reasonably expand the distribution breadth of the small sample and the classification accuracy of data points at the edge of the small-sample distribution is guaranteed.
Fig. 2 is a schematic structural diagram of a data augmentation apparatus according to an embodiment of the present invention. As shown in Fig. 2, the apparatus comprises an acquisition module 210 and an augmentation module 220. The acquisition module 210 is configured to acquire working condition label data and to generate an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data. The augmentation module 220 is configured to input the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; the trained working condition data augmentation model is obtained by training on real-label normal data and false-label initialization sample augmentation data.
The apparatus provided in the embodiments of the present invention is used to execute the above embodiments of the method, and specific flow and details refer to the above embodiments, which are not repeated herein.
According to the embodiments of the invention, a data augmentation model that can generate data similar to the device's large-sample label data is constructed from the device's large-sample data, and the model is then adjusted with the small-sample working condition label data. The small-sample working condition label data can thus be augmented, the distribution of the small-sample augmented data more closely approaches the breadth of the large-sample data distribution, the distribution breadth of the small sample is expanded in a reasonable way, and the credibility of the augmented data at the edge of the small-sample distribution is guaranteed.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 3, the electronic device may include a processor 310, a communications interface (Communications Interface) 320, a memory 330 and a communication bus 340, where the processor 310, the communications interface 320 and the memory 330 communicate with one another via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method: acquiring working condition label data, and generating an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data; inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; wherein the trained working condition data augmentation model is obtained by training on real-label normal data and false-label initialization sample augmentation data.
Further, the logic instructions in the memory 330 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Embodiments of the present invention disclose a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the methods provided by the method embodiments described above, for example comprising: acquiring working condition label data, and generating an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data; inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; wherein the trained working condition data augmentation model is obtained by training on real-label normal data and false-label initialization sample augmentation data.
Embodiments of the present invention provide a non-transitory computer readable storage medium storing server instructions that cause a computer to perform the methods provided by the above embodiments, for example including: acquiring working condition label data, and generating an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data; inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data; wherein the trained working condition data augmentation model is obtained by training on real-label normal data and false-label initialization sample augmentation data.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A data augmentation method, comprising:
acquiring working condition label data, and generating an arbitrary amount of random data with the same size as the working condition label data as initialization data of the augmented data;
inputting the working condition label data and the initialization data of the augmented data into a trained working condition data augmentation model to obtain augmented working condition data;
wherein the trained working condition data augmentation model is obtained by training on real-label normal data and a false-label initialization sample augmentation data set;
wherein, before the step of inputting the working condition label data and the initialization data of the augmented data into the trained working condition data augmentation model, the method further comprises:
acquiring real-label normal data, and generating an arbitrary amount of random data with the same size as the real-label normal data as initialization sample augmentation data;
inputting the initialization sample augmentation data into the generator of a data augmentation model to obtain false-label sample augmentation data, taking the false-label sample augmentation data as new initialization sample augmentation data and inputting it into the generator again for training, until the loss function of the generator converges stably, to obtain a trained generator;
mixing the false-label sample augmentation data with the real-label normal data and inputting the mixture into the discriminator of the data augmentation model, and obtaining a trained discriminator when the loss function of the discriminator converges stably;
and obtaining a trained data augmentation model from the trained generator and the trained discriminator.
2. The data augmentation method of claim 1, wherein prior to the step of acquiring working condition label data, the method further comprises:
splitting industrial raw data into real-label non-working-condition data and working condition label data;
and cleaning outliers from the real-label non-working-condition data to obtain real-label normal data.
3. The data augmentation method of claim 2, wherein the trained working condition data augmentation model comprises a trained generator and a trained discriminator.
4. The data augmentation method of claim 1, wherein after the step of obtaining a trained data augmentation model from the trained generator and the trained discriminator, the method further comprises:
acquiring sample working condition label data, and generating an arbitrary amount of initialization sample working condition augmentation data with the same size as the sample working condition label data;
and taking the trained data augmentation model as a base model, continuing to train the base model on the initialization sample working condition augmentation data and the sample working condition label data, and obtaining the trained working condition data augmentation model when the loss function of the base model converges stably.
5. The data augmentation method of claim 1, wherein the generator in the data augmentation model is formed by combining a convolutional neural network encoder with a convolutional neural network decoder.
6. The data augmentation method of claim 5, wherein the loss function of the generator is obtained as follows:
the real-label normal data and the initialization sample augmentation data are each processed by the convolutional neural network encoder to extract a real-label normal data feature vector and an initial augmentation feature vector, and a first loss function is obtained from the mean square error between the two feature vectors;
the initial augmentation feature vector is decoded by the convolutional neural network decoder to obtain sample augmentation false data, and a second loss function is obtained from the mean square error between the sample augmentation false data and the real-label normal data;
a third loss function is obtained from the cosine distance between the fast Fourier transform of the sample augmentation false data and the fast Fourier transform of the real-label normal data;
and the first, second and third loss functions are weighted and summed to obtain the loss function of the generator.
7. The data augmentation method of claim 1, wherein after the step of obtaining the augmented working condition data, the method further comprises:
and inputting the augmented working condition data into an automatic learning algorithm of the industrial mechanical equipment to improve its accuracy.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the data augmentation method of any one of claims 1 to 7 when said program is executed by said processor.
9. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps of the data augmentation method of any of claims 1 to 7.
CN201911157799.4A 2019-11-22 2019-11-22 Data augmentation method and electronic equipment Active CN110929786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911157799.4A CN110929786B (en) 2019-11-22 2019-11-22 Data augmentation method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911157799.4A CN110929786B (en) 2019-11-22 2019-11-22 Data augmentation method and electronic equipment

Publications (2)

Publication Number Publication Date
CN110929786A CN110929786A (en) 2020-03-27
CN110929786B true CN110929786B (en) 2023-08-01

Family

ID=69850823

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911157799.4A Active CN110929786B (en) 2019-11-22 2019-11-22 Data augmentation method and electronic equipment

Country Status (1)

Country Link
CN (1) CN110929786B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464939A (en) * 2021-01-28 2021-03-09 知行汽车科技(苏州)有限公司 Data augmentation method, device and storage medium in target detection

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108680807A (en) * 2018-05-17 2018-10-19 国网山东省电力公司青岛供电公司 The Diagnosis Method of Transformer Faults and system of network are fought based on condition production
CN109543674A (en) * 2018-10-19 2019-03-29 天津大学 A kind of image copy detection method based on generation confrontation network
CN109635774A (en) * 2018-12-21 2019-04-16 中山大学 A kind of human face synthesizing method based on generation confrontation network
CN109815928A (en) * 2019-01-31 2019-05-28 中国电子进出口有限公司 A kind of face image synthesis method and apparatus based on confrontation study
CN110119787A (en) * 2019-05-23 2019-08-13 湃方科技(北京)有限责任公司 A kind of rotary-type operating condition of mechanical equipment detection method and equipment
WO2019221654A1 (en) * 2018-05-17 2019-11-21 Tobii Ab Autoencoding generative adversarial network for augmenting training data usable to train predictive models

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10592779B2 (en) * 2017-12-21 2020-03-17 International Business Machines Corporation Generative adversarial network medical image generation for training of a classifier

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108680807A (en) * 2018-05-17 2018-10-19 国网山东省电力公司青岛供电公司 The Diagnosis Method of Transformer Faults and system of network are fought based on condition production
WO2019221654A1 (en) * 2018-05-17 2019-11-21 Tobii Ab Autoencoding generative adversarial network for augmenting training data usable to train predictive models
CN109543674A (en) * 2018-10-19 2019-03-29 天津大学 A kind of image copy detection method based on generation confrontation network
CN109635774A (en) * 2018-12-21 2019-04-16 中山大学 A kind of human face synthesizing method based on generation confrontation network
CN109815928A (en) * 2019-01-31 2019-05-28 中国电子进出口有限公司 A kind of face image synthesis method and apparatus based on confrontation study
CN110119787A (en) * 2019-05-23 2019-08-13 湃方科技(北京)有限责任公司 A kind of rotary-type operating condition of mechanical equipment detection method and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Maayan Frid-Adar et al., "Synthetic Data Augmentation Using GAN for Improved Liver Lesion Classification", 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018, pp. 289-293. *
You Mingyu et al., "Small-Sample License Plate Recognition Based on Sample Expansion" (基于样本扩充的小样本车牌识别), Journal of Nanjing Normal University (Natural Science Edition), Vol. 42, 2019, pp. 1-10. *

Also Published As

Publication number Publication date
CN110929786A (en) 2020-03-27

Similar Documents

Publication Publication Date Title
CN110261080B (en) Heterogeneous rotary mechanical anomaly detection method and system based on multi-mode data
CN109685805B (en) Image segmentation method and device
CN114326655A (en) Industrial robot fault data generation method, system, terminal and storage medium
CN110929786B (en) Data augmentation method and electronic equipment
CN116226676B (en) Machine tool fault prediction model generation method suitable for extreme environment and related equipment
CN111488947A (en) Fault detection method and device for power system equipment
CN112463564B (en) Method and device for determining associated index influencing host state
CN112836807A (en) Data processing method and device based on neural network
CN110442439B (en) Task process processing method and device and computer equipment
CN115934484B (en) Diffusion model data enhancement-based anomaly detection method, storage medium and apparatus
CN114972695B (en) Point cloud generation method and device, electronic equipment and storage medium
CN114301719B (en) Malicious update detection method and system based on variational self-encoder
CN111027678B (en) Data migration method and device
CN110990837B (en) System call behavior sequence dimension reduction method, system, equipment and storage medium
CN109409226B (en) Finger vein image quality evaluation method and device based on cascade optimization CNN
CN113837930B (en) Face image synthesis method, device and computer readable storage medium
CN111950233B (en) Code scanning identification method and device, electronic equipment and readable storage medium
CN117632716B (en) Data processing method and device for software security test
CN117827620B (en) Abnormality diagnosis method, training device, training equipment, and recording medium
CN116107859B (en) Container fault prediction method and device, electronic equipment and storage medium
Lee et al. Restoration of Time-Series Medical Data with Diffusion Model
CN115309645A (en) Defect positioning method, device, equipment and storage medium for development and test
CN116432128A (en) Target classification method, device, electronic equipment and storage medium
CN116467567A (en) GAN-based wind driven generator variable pitch bearing data enhancement method and device
CN117274853A (en) Differential depth detection method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Lin Siyu

Inventor after: Yang Chenwang

Inventor after: Ma Jun

Inventor after: Liu Tao

Inventor after: Li Sujie

Inventor after: Wang Wei

Inventor after: Shi Chao

Inventor after: Zhou Jingyuan

Inventor before: Lin Siyu

Inventor before: Yang Chenwang

Inventor before: Ma Jun

Inventor before: Liu Yongpan

Inventor before: Liu Tao

Inventor before: Li Sujie

Inventor before: Wang Wei

Inventor before: Shi Chao

Inventor before: Zhou Jingyuan

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant