CN115526891A - Training method and related device for a generation model of a defect data set

Info

Publication number: CN115526891A (application CN202211498083.2A; granted as CN115526891B)
Authority: CN (China)
Prior art keywords: module, attention, feature, channel, generate
Legal status: Granted
Application number: CN202211498083.2A
Other languages: Chinese (zh)
Other versions: CN115526891B (en)
Inventors: 乐康, 张耀, 张滨, 徐大鹏, 曹保桂
Current/Original Assignee: Shenzhen Seichitech Technology Co ltd
Application filed by Shenzhen Seichitech Technology Co ltd; priority to CN202211498083.2A
Publication of CN115526891A; application granted; publication of CN115526891B
Legal status: Active

Classifications

    • G06T7/0004 Industrial image inspection (G06T7/00 Image analysis; G06T7/0002 Inspection of images, e.g. flaw detection)
    • G06N3/08 Learning methods (G06N3/00 Computing arrangements based on biological models; G06N3/02 Neural networks)
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components (G06V10/40 Extraction of image or video features)
    • G06V10/764 Recognition or understanding using classification, e.g. of video objects (G06V10/70 using pattern recognition or machine learning)
    • G06V10/82 Recognition or understanding using neural networks (G06V10/70 using pattern recognition or machine learning)
    • G06T2207/20076 Probabilistic image processing (G06T2207/20 Special algorithmic details)
    • G06T2207/20081 Training; Learning (G06T2207/20 Special algorithmic details)
    • G06T2207/20084 Artificial neural networks [ANN] (G06T2207/20 Special algorithmic details)
    • G06T2207/30108 Industrial image inspection; G06T2207/30141 Printed circuit board [PCB] (G06T2207/30 Subject of image)
    • Y02P90/30 Computing systems specially adapted for manufacturing (Y02P Climate change mitigation technologies in the production or processing of goods)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a training method and a related device for a generation model of a defect data set, used to improve both the acquisition efficiency and the image quality of defect images. The training method comprises the following steps: acquiring a convolutional neural network model and a defect type label; generating a simulated defect image through a generator and the defect type label; generating a first and a second true/false discrimination value through a true/false discriminator; generating a first and a second category discrimination value through a category discriminator; calculating a first, a second and a third loss value from the first and second true/false discrimination values and the first and second category discrimination values; judging whether the first, second and third loss values meet preset conditions; if the conditions are met, determining that training of the convolutional neural network model is finished; if the conditions are not met, updating the weights of the generator according to the first, second and third loss values, and updating the weights of the true/false discriminator and of the category discriminator according to the second and third loss values respectively.

Description

Training method and related device for a generation model of a defect data set
Technical Field
The embodiments of the application relate to the field of model training, and in particular to a training method and a related device for a generation model of a defect data set.
Background
In recent years, with the continuous development of computing, the range of applications of convolutional neural network models has expanded rapidly, covering manufacturing, daily life and more. Classifying the content of an image is one of the main functions of a convolutional neural network model and can be applied to identifying defects in articles, for example identifying defects on a PCB during its manufacture. By training a convolutional neural network model on images of a given defect, the model's ability to recognize the characteristic features of that defect is improved. As a new technology, deep learning has flourished in the field of PCB appearance-defect detection: because it learns the appearance-defect features of samples autonomously, it avoids the complexity of hand-designed algorithms, and because it offers accurate detection, high detection efficiency and good generalization across many types of PCB appearance defects, it has been widely applied to industrial PCB appearance-defect detection.
Owing to the nature of deep-learning algorithms, a large number of appearance-defect sample images are needed as a training data set from which a neural network can learn the defect features, so the acquisition of defect images has become a major factor restricting the development and application of deep learning in PCB appearance-defect detection. Every defect picture requires a defective PCB; such boards are scarce, and defective PCB samples are confidential assets of the various manufacturers and are therefore hard to obtain. Three approaches to building a defect image data set are currently mainstream: photographing physical PCBs, generating pseudo-defect images, and data set enhancement.
Traditionally, defect images have been sourced mainly by photographing physical PCBs: a real board with defects is located and the defects are photographed with a high-resolution camera. Since few manufactured PCBs carry defects, this method is extremely inefficient.
To address this, defect images can be produced by data set enhancement, which applies image operations such as random rotation, random cropping, random scaling and gray-level transformation to an original image to obtain pictures that look different from it. In terms of the statistical distribution of the underlying data, however, these pictures are essentially identical to the original, so the effect is poor.
Defect images are also produced by generating pseudo-defect images: the pixel characteristics of a defect image are encoded in software, and the computer sets the value of each pixel to synthesize a fake defect image. This approach only imitates the visible appearance of a defect; the artificially generated images differ enormously from real defect images in the statistical distribution of their data, and only a few reach training quality.
In summary, with current methods of artificially generating images of specific defects, acquisition efficiency and image quality cannot both be achieved.
Disclosure of Invention
The application discloses a training method and a related device for a generation model of a defect data set, used to improve both the acquisition efficiency and the quality of defect images.
A first aspect of the present application provides a training method for a generation model of a defect data set, comprising:
acquiring a convolutional neural network model and a defect type label, wherein the convolutional neural network model comprises a generator, a true/false discriminator and a category discriminator;
inputting a group of normally distributed sampling data and the defect type label into the generator to generate a simulated defect image;
inputting a real defect image and the simulated defect image into the true/false discriminator to generate a first true/false discrimination value for the real defect image and a second true/false discrimination value for the simulated defect image;
inputting the real defect image and the simulated defect image into the category discriminator to generate a first category discrimination value for the real defect image and a second category discrimination value for the simulated defect image;
calculating a first loss value for the generator, a second loss value for the true/false discriminator and a third loss value for the category discriminator from the first true/false discrimination value, the second true/false discrimination value, the first category discrimination value and the second category discrimination value;
judging whether the first, second and third loss values meet preset conditions;
if the conditions are met, determining that training of the convolutional neural network model is finished;
if the conditions are not met, updating the weights of the generator according to the first, second and third loss values, and updating the weights of the true/false discriminator and of the category discriminator according to the second and third loss values respectively (a minimal sketch of this training step is given below).
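For orientation, the following is a minimal sketch of the three-loss training step in PyTorch style. It is not the patented implementation: the module names (G, D_adv, D_cls), the one-hot label encoding, and the choice of cross-entropy losses are illustrative assumptions, and D_cls is assumed to return class logits.

```python
import torch
import torch.nn.functional as F

def train_step(G, D_adv, D_cls, opt_g, opt_d_adv, opt_d_cls,
               real_img, defect_label, z):
    # Generator: simulated defect image from normally distributed samples z
    # and the (assumed one-hot) defect type label.
    fake_img = G(z, defect_label)
    target = defect_label.argmax(dim=1)          # class indices from one-hot labels

    # True/false discriminator: first and second true/false discrimination values.
    d_real = D_adv(real_img, defect_label)
    d_fake = D_adv(fake_img.detach(), defect_label)
    # Second loss value (true/false discriminator).
    loss_d_adv = (F.binary_cross_entropy(d_real, torch.ones_like(d_real)) +
                  F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    opt_d_adv.zero_grad(); loss_d_adv.backward(); opt_d_adv.step()

    # Category discriminator: first and second category discrimination values.
    c_real = D_cls(real_img, defect_label)
    c_fake = D_cls(fake_img.detach(), defect_label)
    # Third loss value (category discriminator); using the fakes here is an
    # ACGAN-style assumption, not a detail taken from the patent.
    loss_d_cls = F.cross_entropy(c_real, target) + F.cross_entropy(c_fake, target)
    opt_d_cls.zero_grad(); loss_d_cls.backward(); opt_d_cls.step()

    # First loss value (generator), driven by both discriminators.
    d_g = D_adv(fake_img, defect_label)
    c_g = D_cls(fake_img, defect_label)
    loss_g = (F.binary_cross_entropy(d_g, torch.ones_like(d_g)) +
              F.cross_entropy(c_g, target))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_g.item(), loss_d_adv.item(), loss_d_cls.item()
```

The preset-condition check of the method then simply compares the three returned loss values against their thresholds to decide whether training is finished.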
Optionally, the generator comprises N generation modules Generator and an output module Conv_out, where N is an integer greater than or equal to 2;
inputting a group of normally distributed sampling data and the defect type label into the generator to generate the simulated defect image then comprises:
performing a convolution operation on the normally distributed sampling data through a first convolution module to generate a first sampling feature;
performing a convolution operation on the defect type label through a second convolution module to generate a first label feature;
inputting the first sampling feature and the first label feature into the first generation module Generator to generate a first generation parameter;
inputting the first generation parameter, the first sampling feature and the first label feature into the second generation module Generator to generate a second generation parameter;
inputting the (N-1)th generation parameter, the first sampling feature and the first label feature into the Nth generation module Generator to generate an Nth generation parameter;
and restoring and outputting the Nth generation parameter through the output module Conv_out to generate the simulated defect image.
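The chaining of the N generation modules can be pictured with the following minimal sketch, assuming PyTorch; the block classes and their two- and three-input signatures are assumptions made for illustration, not the patent's exact interfaces.

```python
import torch.nn as nn

class DefectGenerator(nn.Module):
    def __init__(self, conv_z, conv_label, gen_modules, conv_out):
        super().__init__()
        self.conv_z = conv_z                            # first convolution module
        self.conv_label = conv_label                    # second convolution module
        self.gen_modules = nn.ModuleList(gen_modules)   # N >= 2 Generator blocks
        self.conv_out = conv_out                        # output module Conv_out

    def forward(self, z, label):
        feat_z = self.conv_z(z)                         # first sampling feature
        feat_y = self.conv_label(label)                 # first label feature
        # The first block consumes only the two features; every later block
        # also receives the generation parameter of its predecessor.
        param = self.gen_modules[0](feat_z, feat_y)
        for block in self.gen_modules[1:]:
            param = block(param, feat_z, feat_y)
        return self.conv_out(param)                     # simulated defect image
```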
Optionally, a generation module Generator comprises a region pixel attention module RPA, an attention Dropout module ADO, at least two channel attention modules Attention, a deconvolution module and a convolution module;
inputting the first sampling feature and the first label feature into the first generation module Generator to generate the first generation parameter then specifically comprises:
performing region pixel-value weight generation on the first sampling feature through the region pixel attention module RPA to generate a first intermediate feature;
multiplying the first sampling feature and the first intermediate feature channel-wise through the region pixel attention module RPA to generate a second intermediate feature;
performing convolution and channel superposition on the second intermediate feature through a third convolution module;
generating a channel vector for the second intermediate feature through a first channel attention module Attention;
outputting, by the first channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the second intermediate feature;
multiplying the second intermediate feature channel-wise by the normalized one-dimensional vector through the first channel attention module Attention to generate a third intermediate feature;
performing convolution and channel superposition on the third intermediate feature through a fourth convolution module;
generating a channel vector for the third intermediate feature through a second channel attention module Attention;
outputting, by the second channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the third intermediate feature;
multiplying the third intermediate feature channel-wise by the normalized one-dimensional vector through the second channel attention module Attention to generate a fourth intermediate feature;
superposing the fourth intermediate feature, the first label feature and the first sampling feature by channel to generate a fifth intermediate feature;
reconstructing the length and width of the fifth intermediate feature through a first deconvolution module to generate a sixth intermediate feature;
assigning attention to each neuron of the sixth intermediate feature through a first attention Dropout module ADO and zeroing the neurons whose attention is below a first preset threshold to generate a seventh intermediate feature;
and reconstructing the length and width of the seventh intermediate feature through a second deconvolution module to generate the first generation parameter (sketches of the RPA and channel attention primitives follow below).
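The two attention primitives this module leans on can be sketched as follows, assuming PyTorch and feature maps of shape (B, C, H, W). The scoring layers, the reduction ratio, and the use of sigmoid/softmax for "weight generation" and "normalization" are assumptions; the patent fixes only the behavior, not these layer choices.

```python
import torch.nn as nn

class RPA(nn.Module):
    """Region pixel attention: generate a per-pixel weight map, scale the input."""
    def __init__(self, channels):
        super().__init__()
        self.weights = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),                    # region pixel-value weights in (0, 1)
        )

    def forward(self, x):
        w = self.weights(x)                  # intermediate feature (weight map)
        return x * w                         # channel-wise correspondence multiply

class ChannelAttention(nn.Module):
    """Channel attention: a normalized 1-D vector, one weight per channel."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # channel vector
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Softmax(dim=1),               # normalized, dimension = channel count
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        v = self.fc(self.pool(x).view(b, c)) # normalized one-dimensional vector
        return x * v.view(b, c, 1, 1)        # per-channel multiplication
```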
Optionally, inputting the first generation parameter, the first sampling feature and the first label feature into the second generation module Generator to generate the second generation parameter specifically comprises:
superposing the first generation parameter and the first sampling feature by channel to generate an eighth intermediate parameter;
performing region pixel-value weight generation on the eighth intermediate parameter through the region pixel attention module RPA to generate a ninth intermediate feature;
multiplying the eighth intermediate parameter and the ninth intermediate feature channel-wise through the region pixel attention module RPA to generate a tenth intermediate feature;
performing convolution and channel superposition on the tenth intermediate feature through a fifth convolution module;
generating a channel vector for the tenth intermediate feature through a third channel attention module Attention;
outputting, by the third channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the tenth intermediate feature;
multiplying the tenth intermediate feature channel-wise by the normalized one-dimensional vector through the third channel attention module Attention to generate an eleventh intermediate feature;
performing convolution and channel superposition on the eleventh intermediate feature through a sixth convolution module;
generating a channel vector for the eleventh intermediate feature through a fourth channel attention module Attention;
outputting, by the fourth channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the eleventh intermediate feature;
multiplying the eleventh intermediate feature channel-wise by the normalized one-dimensional vector through the fourth channel attention module Attention to generate a twelfth intermediate feature;
superposing the twelfth intermediate feature, the first label feature and the eighth intermediate parameter by channel to generate a thirteenth intermediate feature;
reconstructing the length and width of the thirteenth intermediate feature through a third deconvolution module to generate a fourteenth intermediate feature;
assigning attention to each neuron of the fourteenth intermediate feature through a second attention Dropout module ADO and zeroing the neurons whose attention is below a second preset threshold to generate a fifteenth intermediate feature;
and reconstructing the length and width of the fifteenth intermediate feature through a fourth deconvolution module to generate the second generation parameter.
Optionally, the true/false discriminator consists of at least one true/false discrimination module and a Sigmoid function module, where a true/false discrimination module comprises a region pixel attention module RPA, a region channel attention module SKConv, an attention Dropout module ADO, a channel shuffling module CSA, an attention channel pooling module ACD and a feature compression module FS;
inputting the real defect image and the simulated defect image into the true/false discriminator to generate the first true/false discrimination value for the real defect image and the second true/false discrimination value for the simulated defect image comprises:
performing region pixel-value weight generation on the real defect image through the region pixel attention module RPA to generate a first discrimination feature;
multiplying the real defect image and the first discrimination feature channel-wise through the region pixel attention module RPA to generate a second discrimination feature;
superposing the second discrimination feature and the first label feature by channel to generate a third discrimination feature;
assigning attention to different regions of the third discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the third discrimination feature by the assigned attention to generate a fourth discrimination feature;
assigning attention to each neuron of the fourth discrimination feature through a third attention Dropout module ADO and zeroing the neurons whose attention is below a third preset threshold to generate a fifth discrimination feature;
channel-shuffling the fifth discrimination feature through the channel shuffling module CSA;
assigning attention to different regions of the fifth discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the fifth discrimination feature by the assigned attention to generate a sixth discrimination feature;
assigning attention to each neuron of the sixth discrimination feature through a fourth attention Dropout module ADO and zeroing the neurons whose attention is below the first preset threshold to generate a seventh discrimination feature;
assigning attention to each channel of the seventh discrimination feature through the attention channel pooling module ACD and discarding the channels ranked last by attention to generate an eighth discrimination feature;
extracting the feature information of the eighth discrimination feature through the feature compression module FS to generate true/false discrimination data;
inputting the true/false discrimination data into the next true/false discrimination module in the true/false discriminator, until the Sigmoid function module analyzes the true/false discrimination data output by the last true/false discrimination module to obtain the first true/false discrimination value;
and inputting the simulated defect image into the true/false discriminator and outputting the second true/false discrimination value through the true/false discrimination modules and the Sigmoid function module (a sketch of the attention Dropout idea follows below).
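The attention Dropout module ADO recurs in both the generator and the discriminators. The following is a speculative sketch of the idea, assuming PyTorch and a 1x1-convolution scoring head, which the patent does not specify:

```python
import torch.nn as nn

class AttentionDropout(nn.Module):
    """Zero out activations whose attention falls below a preset threshold."""
    def __init__(self, channels, threshold=0.1):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),                    # per-neuron attention in (0, 1)
        )
        self.threshold = threshold           # the "preset threshold" of the claims

    def forward(self, x):
        a = self.score(x)
        return x * (a >= self.threshold).float()  # keep only attended neurons
```

Unlike ordinary Dropout, which zeroes neurons at random, this gate is learned and, at a given state of the weights, deterministic.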
Optionally, the category discriminator comprises at least one category discrimination module and a softmax function module, where a category discrimination module comprises a region pixel attention module RPA, a region channel attention module SKConv, a channel shuffling module CSA, an attention channel pooling module ACD and a feature compression module FS;
inputting the real defect image and the simulated defect image into the category discriminator to generate the first category discrimination value for the real defect image and the second category discrimination value for the simulated defect image comprises:
performing region pixel-value weight generation on the real defect image through the region pixel attention module RPA to generate a ninth discrimination feature;
multiplying the real defect image and the ninth discrimination feature channel-wise through the region pixel attention module RPA to generate a tenth discrimination feature;
superposing the tenth discrimination feature and the first label feature by channel to generate an eleventh discrimination feature;
assigning attention to different regions of the eleventh discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the eleventh discrimination feature by the assigned attention to generate a twelfth discrimination feature;
channel-shuffling the twelfth discrimination feature through the channel shuffling module CSA;
assigning attention to regions of different sizes of the twelfth discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the twelfth discrimination feature by the assigned attention to generate a thirteenth discrimination feature;
assigning attention to each channel of the thirteenth discrimination feature through the attention channel pooling module ACD and discarding the channels ranked last by attention to generate a fourteenth discrimination feature;
assigning attention to regions of different sizes of the fourteenth discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the fourteenth discrimination feature by the assigned attention to generate a fifteenth discrimination feature;
extracting the feature information of the fifteenth discrimination feature through the feature compression module FS to generate category discrimination data;
inputting the category discrimination data into the next category discrimination module in the category discriminator, until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain the first category discrimination value;
and inputting the simulated defect image into the category discriminator and outputting the second category discrimination value through the category discrimination modules and the softmax function module.
Optionally, after acquiring the convolutional neural network model and the defect type label and before inputting the group of normally distributed sampling data and the defect type label into the generator, the training method further comprises:
acquiring defect label features of the real defect image through an encoder;
computing a mean set and a variance set, the latent-space parameters, from the defect label features, where the latent-space parameters describe the conditional probability distribution of the real defect image;
and sampling from the mean set and the variance set through the reparameterization technique to generate the normally distributed sampling data, which follow the conditional probability distribution of the real defect image, as sketched below.
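This is the standard reparameterization trick; a minimal sketch, assuming PyTorch and that the variance set is stored as a log-variance, looks as follows:

```python
import torch

def reparameterize(mu, logvar):
    """Sample z ~ N(mu, sigma^2) differentiably: z = mu + sigma * eps."""
    std = torch.exp(0.5 * logvar)    # variance set, assumed stored as log-variance
    eps = torch.randn_like(std)      # normally distributed sampling
    return mu + eps * std            # follows the encoder's conditional distribution
```

Because eps carries all of the randomness, gradients flow through mu and std back into the encoder.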
A second aspect of the present application provides a training apparatus for a generation model of a defect data set, comprising:
a first acquisition unit, configured to acquire a convolutional neural network model and a defect type label, wherein the convolutional neural network model comprises a generator, a true/false discriminator and a category discriminator;
a first generation unit, configured to input a group of normally distributed sampling data and the defect type label into the generator to generate a simulated defect image;
a second generation unit, configured to input a real defect image and the simulated defect image into the true/false discriminator to generate a first true/false discrimination value for the real defect image and a second true/false discrimination value for the simulated defect image;
a third generation unit, configured to input the real defect image and the simulated defect image into the category discriminator to generate a first category discrimination value for the real defect image and a second category discrimination value for the simulated defect image;
a calculation unit, configured to calculate a first loss value for the generator, a second loss value for the true/false discriminator and a third loss value for the category discriminator from the first true/false discrimination value, the second true/false discrimination value, the first category discrimination value and the second category discrimination value;
a judgment unit, configured to judge whether the first, second and third loss values meet preset conditions;
a determination unit, configured to determine that training of the convolutional neural network model is finished when the judgment unit determines that the first, second and third loss values meet the conditions;
and an updating unit, configured to update the weights of the generator according to the first, second and third loss values when the judgment unit determines that the loss values do not meet the conditions, and to update the weights of the true/false discriminator and of the category discriminator according to the second and third loss values respectively.
Optionally, the generator comprises N generation modules Generator and an output module Conv_out, where N is an integer greater than or equal to 2;
the first generation unit comprises:
a first generation module, configured to perform a convolution operation on the normally distributed sampling data through the first convolution module to generate a first sampling feature;
a second generation module, configured to perform a convolution operation on the defect type label through the second convolution module to generate a first label feature;
a third generation module, configured to input the first sampling feature and the first label feature into the first generation module Generator to generate a first generation parameter;
a fourth generation module, configured to input the first generation parameter, the first sampling feature and the first label feature into the second generation module Generator to generate a second generation parameter;
a fifth generation module, configured to input the (N-1)th generation parameter, the first sampling feature and the first label feature into the Nth generation module Generator to generate an Nth generation parameter;
and a sixth generation module, configured to restore and output the Nth generation parameter through the output module Conv_out to generate the simulated defect image.
Optionally, a generation module Generator comprises a region pixel attention module RPA, an attention Dropout module ADO, at least two channel attention modules Attention, a deconvolution module and a convolution module;
the third generation module is specifically configured for:
performing region pixel-value weight generation on the first sampling feature through the region pixel attention module RPA to generate a first intermediate feature;
multiplying the first sampling feature and the first intermediate feature channel-wise through the region pixel attention module RPA to generate a second intermediate feature;
performing convolution and channel superposition on the second intermediate feature through a third convolution module;
generating a channel vector for the second intermediate feature through a first channel attention module Attention;
outputting, by the first channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the second intermediate feature;
multiplying the second intermediate feature channel-wise by the normalized one-dimensional vector through the first channel attention module Attention to generate a third intermediate feature;
performing convolution and channel superposition on the third intermediate feature through a fourth convolution module;
generating a channel vector for the third intermediate feature through a second channel attention module Attention;
outputting, by the second channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the third intermediate feature;
multiplying the third intermediate feature channel-wise by the normalized one-dimensional vector through the second channel attention module Attention to generate a fourth intermediate feature;
superposing the fourth intermediate feature, the first label feature and the first sampling feature by channel to generate a fifth intermediate feature;
reconstructing the length and width of the fifth intermediate feature through a first deconvolution module to generate a sixth intermediate feature;
assigning attention to each neuron of the sixth intermediate feature through a first attention Dropout module ADO and zeroing the neurons whose attention is below a first preset threshold to generate a seventh intermediate feature;
and reconstructing the length and width of the seventh intermediate feature through a second deconvolution module to generate the first generation parameter.
Optionally, the fourth generation module is specifically configured for:
superposing the first generation parameter and the first sampling feature by channel to generate an eighth intermediate parameter;
performing region pixel-value weight generation on the eighth intermediate parameter through the region pixel attention module RPA to generate a ninth intermediate feature;
multiplying the eighth intermediate parameter and the ninth intermediate feature channel-wise through the region pixel attention module RPA to generate a tenth intermediate feature;
performing convolution and channel superposition on the tenth intermediate feature through a fifth convolution module;
generating a channel vector for the tenth intermediate feature through a third channel attention module Attention;
outputting, by the third channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the tenth intermediate feature;
multiplying the tenth intermediate feature channel-wise by the normalized one-dimensional vector through the third channel attention module Attention to generate an eleventh intermediate feature;
performing convolution and channel superposition on the eleventh intermediate feature through a sixth convolution module;
generating a channel vector for the eleventh intermediate feature through a fourth channel attention module Attention;
outputting, by the fourth channel attention module Attention from the channel vector, a normalized one-dimensional vector whose dimension equals the channel count of the eleventh intermediate feature;
multiplying the eleventh intermediate feature channel-wise by the normalized one-dimensional vector through the fourth channel attention module Attention to generate a twelfth intermediate feature;
superposing the twelfth intermediate feature, the first label feature and the eighth intermediate parameter by channel to generate a thirteenth intermediate feature;
reconstructing the length and width of the thirteenth intermediate feature through a third deconvolution module to generate a fourteenth intermediate feature;
assigning attention to each neuron of the fourteenth intermediate feature through a second attention Dropout module ADO and zeroing the neurons whose attention is below a second preset threshold to generate a fifteenth intermediate feature;
and reconstructing the length and width of the fifteenth intermediate feature through a fourth deconvolution module to generate the second generation parameter.
Optionally, the true/false discriminator consists of at least one true/false discrimination module and a Sigmoid function module, where a true/false discrimination module comprises a region pixel attention module RPA, a region channel attention module SKConv, an attention Dropout module ADO, a channel shuffling module CSA, an attention channel pooling module ACD and a feature compression module FS;
the second generation unit is specifically configured for:
performing region pixel-value weight generation on the real defect image through the region pixel attention module RPA to generate a first discrimination feature;
multiplying the real defect image and the first discrimination feature channel-wise through the region pixel attention module RPA to generate a second discrimination feature;
superposing the second discrimination feature and the first label feature by channel to generate a third discrimination feature;
assigning attention to different regions of the third discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the third discrimination feature by the assigned attention to generate a fourth discrimination feature;
assigning attention to each neuron of the fourth discrimination feature through a third attention Dropout module ADO and zeroing the neurons whose attention is below a third preset threshold to generate a fifth discrimination feature;
channel-shuffling the fifth discrimination feature through the channel shuffling module CSA;
assigning attention to different regions of the fifth discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the fifth discrimination feature by the assigned attention to generate a sixth discrimination feature;
assigning attention to each neuron of the sixth discrimination feature through a fourth attention Dropout module ADO and zeroing the neurons whose attention is below the first preset threshold to generate a seventh discrimination feature;
assigning attention to each channel of the seventh discrimination feature through the attention channel pooling module ACD and discarding the channels ranked last by attention to generate an eighth discrimination feature;
extracting the feature information of the eighth discrimination feature through the feature compression module FS to generate true/false discrimination data;
inputting the true/false discrimination data into the next true/false discrimination module in the true/false discriminator, until the Sigmoid function module analyzes the true/false discrimination data output by the last true/false discrimination module to obtain the first true/false discrimination value;
and inputting the simulated defect image into the true/false discriminator and outputting the second true/false discrimination value through the true/false discrimination modules and the Sigmoid function module.
Optionally, the category discriminator comprises at least one category discrimination module and a softmax function module, where a category discrimination module comprises a region pixel attention module RPA, a region channel attention module SKConv, a channel shuffling module CSA, an attention channel pooling module ACD and a feature compression module FS;
the third generation unit is specifically configured for:
performing region pixel-value weight generation on the real defect image through the region pixel attention module RPA to generate a ninth discrimination feature;
multiplying the real defect image and the ninth discrimination feature channel-wise through the region pixel attention module RPA to generate a tenth discrimination feature;
superposing the tenth discrimination feature and the first label feature by channel to generate an eleventh discrimination feature;
assigning attention to regions of different sizes of the eleventh discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the eleventh discrimination feature by the assigned attention to generate a twelfth discrimination feature;
channel-shuffling the twelfth discrimination feature through the channel shuffling module CSA;
assigning attention to different regions of the twelfth discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the twelfth discrimination feature by the assigned attention to generate a thirteenth discrimination feature;
assigning attention to each channel of the thirteenth discrimination feature through the attention channel pooling module ACD and discarding the channels ranked last by attention to generate a fourteenth discrimination feature;
assigning attention to different regions of the fourteenth discrimination feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screening the feature channels of the fourteenth discrimination feature by the assigned attention to generate a fifteenth discrimination feature;
extracting the feature information of the fifteenth discrimination feature through the feature compression module FS to generate category discrimination data;
inputting the category discrimination data into the next category discrimination module in the category discriminator, until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain the first category discrimination value;
and inputting the simulated defect image into the category discriminator and outputting the second category discrimination value through the category discrimination modules and the softmax function module.
Optionally, operating after the first acquisition unit and before the first generation unit, the training apparatus further comprises:
a second acquisition unit, configured to acquire defect label features of the real defect image through an encoder;
a fourth generation unit, configured to compute a mean set and a variance set, the latent-space parameters, from the defect label features, where the latent-space parameters describe the conditional probability distribution of the real defect image;
and a fifth generation unit, configured to sample from the mean set and the variance set through the reparameterization technique to generate the normally distributed sampling data, which follow the conditional probability distribution of the real defect image.
A third aspect of the present application provides an electronic device, comprising:
the device comprises a processor, a memory, an input/output unit and a bus;
the processor is connected with the memory, the input/output unit and the bus;
the memory holds a program that the processor calls to perform the training method of the first aspect or of any optional implementation of the first aspect.
A fourth aspect of the present application provides a computer-readable storage medium on which a program is stored; when executed on a computer, the program performs the training method of the first aspect or of any optional implementation of the first aspect.
According to the technical scheme, the embodiments of the application have the following advantages:
First, a convolutional neural network model and a defect type label are acquired, where the convolutional neural network model comprises a generator, a true/false discriminator and a category discriminator, and the defect type label is the defect label of the image to be generated. A group of normally distributed sampling data and the defect type label are input into the generator to generate a simulated defect image; that is, normally distributed data are sampled and fed, together with the defect type label, into the generator, so that the generator produces a simulated defect image conditioned on the type label and the sampling data. A real defect image is selected that corresponds to the same defect type as the defect type label; the real defect image and the simulated defect image are input into the true/false discriminator to generate a first true/false discrimination value for the real defect image and a second true/false discrimination value for the simulated defect image. The real and simulated defect images are likewise input into the category discriminator to generate a first and a second category discrimination value. From these four discrimination values, a first loss value for the generator, a second loss value for the true/false discriminator and a third loss value for the category discriminator are calculated, and it is judged whether the three loss values meet preset conditions. If the conditions are met, training of the convolutional neural network model is finished. If the conditions are not met, the weights of the generator are updated according to the first, second and third loss values, and the weights of the true/false discriminator and of the category discriminator are updated according to the second and third loss values respectively. In this embodiment, the generator synthesizes a simulated defect image from the type label and the normally distributed sampling data; true/false and category discrimination are then performed on the corresponding real and simulated images, and the loss values are propagated backwards: the generator is updated with all three loss values, while the true/false discriminator and the category discriminator are each updated with their own loss value. Once training is finished, the generator can produce an image from nothing more than normally distributed sampling data and a defect type label, and the generated images satisfy the standards of both discriminators, so both the acquisition efficiency and the quality of defect images are improved to a great extent.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for the embodiments or for the description of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; other drawings can be derived from them by those skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of an embodiment of the training method for a generation model of a defect data set of the present application;
FIG. 2 is a schematic flow chart of an embodiment of the network layers of the convolutional neural network model in the embodiments of the present application;
FIG. 3 is a schematic structural diagram of another embodiment of the network layers of the convolutional neural network model in the embodiments of the present application;
FIG. 4 is a schematic structural diagram of another embodiment of the network layers of the convolutional neural network model in the embodiments of the present application;
FIG. 5-1 is a schematic diagram of an embodiment of a first stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-2 is a schematic diagram of an embodiment of a second stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-3 is a schematic diagram of an embodiment of a third stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-4 is a schematic diagram of an embodiment of a fourth stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-5 is a schematic diagram of an embodiment of a fifth stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-6 is a schematic diagram of an embodiment of a sixth stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-7 is a schematic diagram of an embodiment of a seventh stage of the training method for a generation model of a defect data set of the present application;
FIG. 5-8 is a schematic diagram of an embodiment of an eighth stage of the training method for a generation model of a defect data set of the present application;
FIG. 6 is a schematic diagram of an embodiment of the training apparatus for a generation model of a defect data set of the present application;
FIG. 7 is a schematic diagram of another embodiment of the training apparatus for a generation model of a defect data set of the present application;
FIG. 8 is a schematic diagram of an embodiment of an electronic device of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
In the prior art, defect images are sourced mainly by photographing physical PCBs: a real board with defects is located and the defects are photographed with a high-resolution camera. The defect images obtained in this way are the most realistic, but only a small number can be obtained from the same screen, and since few manufactured PCBs carry defects, the method is extremely inefficient. To address this, defect images can be produced by data set enhancement, which applies image operations such as random rotation, random cropping, random scaling and gray-level transformation to an original image to obtain pictures that look different from it; in terms of the statistical distribution of the underlying data, however, these pictures are essentially identical to the original, so the effect is poor. Defect images are also produced by generating pseudo-defect images: the pixel characteristics of a defect image are encoded in software and the computer sets the value of each pixel to synthesize a fake defect image. This approach only imitates the visible appearance of a defect; the generated images differ enormously from real defect images in the statistical distribution of their data, and only a few reach training quality. In summary, with current methods of artificially generating images of specific defects, acquisition efficiency and image quality cannot both be achieved.
Based on the above, the application discloses a training method and a related device for a generative model of a defect data set, which are used for improving the image acquisition efficiency and the image quality.
The technical solutions in the present application will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The method of the present application may be applied to a server, a device, a terminal, or any other apparatus with logic processing capability, and the present application is not limited thereto. For convenience of description, the following description takes a terminal as the execution subject.
Referring to fig. 1, the present application provides an embodiment of a training method for generative models of defect data sets, including:
101. acquiring a convolutional neural network model and a defect type label, wherein the convolutional neural network model comprises a generator, a true and false discriminator and a category discriminator;
the convolutional neural network model used in this embodiment is a conditional generative adversarial network (CGAN). It is based on a generative adversarial network (GAN) and, under the original network structure, adds auxiliary information to the inputs of the discriminator and the generator; the auxiliary information may be the defect type label of the data. In this embodiment, the defect type label used in the generator of the convolutional neural network model is the above-mentioned auxiliary information.
Based on this principle, the objective function of the CGAN can be obtained. Compared with the GAN objective, the overall objective function is unchanged; the only difference is that the defect type label is attached to the discriminator input and to the random input of the generator.
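For context, the CGAN literature commonly writes this objective in the following standard form (the notation below is an assumption chosen to match the loss definitions used later, not a reproduction of the filing's own formula image):

    \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x \mid c)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z \mid c) \mid c)\big)\big]

where c is the defect type label attached to both the discriminator input and the generator's random input.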
The defect type label indicates the defect type to which the PCB in an image belongs. Pictures of three common PCB appearance defects (depression, internal fracture, and board-surface residual glue) are obtained by high-definition photographing, and labels are set manually for the three defects, for example depression = 1, internal fracture = 2, and residual glue = 3. Small images containing the three kinds of defects are cropped out of the whole PCB image, which avoids the interference of a large number of useless pixels, saves memory, and shortens training time. The obtained defect pictures are then subjected to image enhancement processing, including random rotation, mirroring, gray-level transformation, and the like, to expand the number of data sets.
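The enhancement step can be sketched as follows (a minimal illustration only; torchvision is assumed, and ColorJitter stands in for the gray-level transformation):

    import torchvision.transforms as T

    # Sketch of the data set expansion described above: random rotation,
    # mirroring and a gray-level-style transform applied to cropped patches.
    augment = T.Compose([
        T.RandomRotation(degrees=180),                 # random rotation
        T.RandomHorizontalFlip(p=0.5),                 # mirror image
        T.RandomVerticalFlip(p=0.5),
        T.ColorJitter(brightness=0.2, contrast=0.2),   # approximates a gray-level transform
    ])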
102. Inputting a group of normal distribution sampling data and a defect type label into a generator to generate a simulated defect image;
the terminal inputs a group of normal distribution sampling data and the defect type label into the generator to generate a simulated defect image.
in a conventional CGAN generator, the random input z is drawn from its random distribution and then spliced with the conditional input y to form a new implicit representation. In the discriminator, both the real data x and the generated data G(z) are input together with the condition y for discrimination. In this embodiment, the normal distribution sampling data and the corresponding condition (the defect type label) are input to the generator, so that the simulated defect image output by the generator carries a defect type.
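A minimal sketch of this conditioning step (PyTorch is assumed; the filing's generator fuses the label through its own convolution branch, so the one-hot concatenation below only illustrates the conventional CGAN idea):

    import torch

    def make_generator_input(z, label, num_classes=3):
        # Sketch: concatenate noise z (B, zdim) with a one-hot defect label.
        # The label branch in the patent is convolutional; this only shows
        # the basic CGAN conditioning of G on the label c.
        y = torch.nn.functional.one_hot(label, num_classes).float()  # (B, 3)
        return torch.cat([z, y], dim=1)                              # (B, zdim + 3)

    z = torch.randn(8, 100)          # normal-distribution sampling data
    c = torch.randint(0, 3, (8,))    # defect labels 1..3 mapped to indices 0..2
    g_in = make_generator_input(z, c)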
103. Inputting the real defect image and the simulated defect image into a true and false discriminator to generate a first true and false discrimination value of the real defect image and a second true and false discrimination value of the simulated defect image;
the terminal selects an image (real defect image) of the PCB with the defect, inputs the image into a true and false discriminator, and outputs a first true and false discrimination value for the image through the true and false discriminator.
And the terminal inputs the defect simulation image into the true and false discriminator, and outputs a second true and false discrimination value for the defect simulation image through the true and false discriminator.
The true-false discriminator should discriminate the real defect image as close to 1 as possible and the simulated defect image as close to 0 as possible; the true-false discriminator and the generator are continuously updated with the output data, so that the generator is trained on the basis of image authenticity.
104. Inputting the real defect image and the simulated defect image into a category discriminator to generate a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image;
and the terminal inputs the real defect image into the category discriminator and outputs a first category discrimination value through the category discriminator.
The terminal inputs the simulated defect image into the category discriminator and outputs a second category discrimination value through the category discriminator.
The category discriminator needs to discriminate the classes of both the real defect image and the simulated defect image with an accuracy above a preset level, for example 95 percent; the category discriminator and the generator are likewise updated with the output data, so that the generator is trained on the basis of the category discriminator's judgment of the class.
105. Calculating a first loss value of a generator, a second loss value of a true-false discriminator and a third loss value of a category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value and the second category discrimination value;
and the terminal calculates a first loss value of the generator, a second loss value of the true-false discriminator and a third loss value of the category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value and the second category discrimination value.
The second true-false discrimination value of the simulated defect image on the true-false discriminator is p_Tg = Dt(G(Z|c)).
The first true-false discrimination value of the real defect image on the true-false discriminator is p_Tx = Dt(X).
The second category discrimination value of the simulated defect image on the category discriminator is p_Cg = Dc(G(Z|c)).
The first category discrimination value of the real defect image on the category discriminator is p_Cx = Dc(X).
Here X is the real defect image, Z is the normal distribution sampling data, c is the defect type label, and G(Z|c) is the simulated defect image.
In this embodiment, p_Tg = Dt(G(Z|c)) and p_Tx = Dt(X) use the two-class BCELoss, while p_Cg = Dc(G(Z|c)) and p_Cx = Dc(X) use the multi-class (3-class) CELoss.
The loss function of the generator is L_G = BCELoss(p_Tg, 1) + CELoss(p_Cg, c).
The loss function of the true-false discriminator is L_Dt = BCELoss(p_Tg, 0) + BCELoss(p_Tx, 1).
The loss function of the category discriminator is L_Dc = (1 - CELoss(p_Cg, c)) + CELoss(p_Cx, c).
Substituting the first true-false discrimination value p_Tx, the second true-false discrimination value p_Tg, the first category discrimination value p_Cx, and the second category discrimination value p_Cg into BCELoss() and CELoss() in the corresponding loss functions yields the loss values.
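The loss definitions above can be sketched in PyTorch as follows (tensor shapes and variable names are assumptions; p_cg and p_cx are taken as raw logits for CrossEntropyLoss):

    import torch
    import torch.nn as nn

    bce = nn.BCELoss()          # two-class loss for the true-false discriminator
    ce = nn.CrossEntropyLoss()  # multi-class (3-class) loss for the category discriminator

    def losses(p_tg, p_tx, p_cg, p_cx, c):
        # p_tg, p_tx: true-false outputs in (0, 1), shape (B, 1)
        # p_cg, p_cx: category logits, shape (B, 3); c: labels, shape (B,)
        ones, zeros = torch.ones_like(p_tg), torch.zeros_like(p_tg)
        l_g = bce(p_tg, ones) + ce(p_cg, c)          # L_G
        l_dt = bce(p_tg, zeros) + bce(p_tx, ones)    # L_Dt
        l_dc = (1 - ce(p_cg, c)) + ce(p_cx, c)       # L_Dc, as defined above
        return l_g, l_dt, l_dc

In practice the generator term and the discriminator terms are computed in separate passes, with the generator output detached for the discriminator update; the sketch only mirrors the algebra of the loss definitions.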
106. Judging whether the first loss value, the second loss value and the third loss value meet preset conditions or not;
the terminal judges whether the first loss value, the second loss value, and the third loss value meet the preset conditions. For example, the first loss values of the generator may be collected into a loss-value change set, and it is judged whether the loss values produced by the last 10000 training iterations have converged; if so, training of the generator is finished. Alternatively, training of the generator is determined to be finished when, in addition to convergence, the loss values of the generator are all smaller than a preset value, or when the total number of training iterations has reached 1,000,000.
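A minimal sketch of such a stopping rule (the window size and iteration cap follow the figures quoted above; the spread-based convergence test and the tolerance are assumptions):

    from collections import deque

    class ConvergenceMonitor:
        # Sketch of the stopping rule described above (thresholds assumed).
        def __init__(self, window=10000, eps=1e-3, max_iters=1_000_000):
            self.losses = deque(maxlen=window)
            self.eps = eps
            self.max_iters = max_iters
            self.iters = 0

        def update(self, loss):
            self.losses.append(float(loss))
            self.iters += 1

        def converged(self):
            if len(self.losses) < self.losses.maxlen:
                return False
            spread = max(self.losses) - min(self.losses)  # flat over the window?
            return spread < self.eps or self.iters >= self.max_iters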
Whether the true-false discriminator and the category discriminator have each finished training can likewise be determined in the above manner.
However, the three parts cannot finish training at the same time, so the next step is taken as required. The preset condition may be that only the generator has finished training, that both the generator and the true-false discriminator have finished training, that both the generator and the category discriminator have finished training, or that all three have finished training, which is not limited here.
107. If the conditions are met, determining that the training of the convolutional neural network model is finished;
when the preset conditions are met, it can be determined that training of the convolutional neural network model is finished. The generator, the true-false discriminator, and the category discriminator can then be taken out separately for application in their corresponding fields: the generator generates simulated defect images, the true-false discriminator identifies the authenticity of PCB defect images, and the category discriminator identifies the defect type of PCB images.
108. And if the condition is not met, updating the weight of the generator according to the first loss value, the second loss value and the third loss value, and respectively updating the weight of the true-false discriminator and the weight of the category discriminator according to the second loss value and the third loss value.
Because the generator, the true-false discriminator, and the category discriminator can hardly finish training at the same time, in this embodiment, when the preset condition requires more than just the generator to finish training and one of the parts finishes first, that part may either stop updating its weights or continue updating them, which is not limited here. For example, when the preset condition is that all three finish training, after the true-false discriminator has finished and the whole convolutional neural network model continues training, the true-false discriminator no longer updates its weights but still computes the loss values used to update the generator and the category discriminator, until training is fully completed.
In this embodiment, the terminal updates the weight of the generator three times according to the first loss value, the second loss value, and the third loss value. And then the terminal updates the true and false discriminator according to the second loss value, and the terminal updates the weight of the category discriminator according to the third loss value.
The weights of the convolutional neural network model can be updated in various ways; in this embodiment, mini-batch stochastic gradient descent is taken as an example. The gradient update for batch training is:

w' = w - (η / n) · Σ_{x=1}^{n} ∂C_x/∂w

where n is the batch size, η is the learning rate, w is the current weight, w' is the updated weight, and ∂C_x/∂w is the weight-update subfunction evaluated on sample x of the batch.
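A minimal sketch of this update rule (pure Python; grads is assumed to hold the summed per-sample gradients of the batch):

    def sgd_step(weights, grads, lr, batch_size):
        # Mini-batch SGD: w' = w - (lr / n) * sum of per-sample gradients.
        return [w - (lr / batch_size) * g for w, g in zip(weights, grads)]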
Back-propagation gradient derivation is used. Referring to fig. 2, fig. 2 is a schematic diagram of a network layer of the convolutional neural network model.
On the left is the first layer, the input layer, containing two neurons a and b. In the middle is the second layer, the hidden layer, containing two neurons c and d. On the right is the third layer, the output layer, containing neurons e and f. The value w^l_jk marked on each connection is the weight between layers: it connects the kth neuron of the previous layer (l-1) to the jth neuron of layer l.
a^l_j denotes the output of the jth neuron of layer l.
z^l_j denotes the input of the jth neuron of layer l.
b^l_j denotes the bias of the jth neuron of layer l.
W denotes a weight matrix, Z an input matrix, A an output matrix, and Y the standard answer.
L denotes the number of layers of the convolutional neural network model, and σ(·) denotes the nonlinear activation function, so that a^l_j = σ(z^l_j).
The forward propagation method is to transmit the signal of the input layer to the hidden layer, taking hidden layer node c as an example, and looking backward (in the direction of the input layer) on node c, it can be seen that there are two arrows pointing to node c, so the information of nodes a and b will be transmitted to node c, and each arrow has a certain weight, so for node c, the input signal is:
Figure 747458DEST_PATH_IMAGE012
similarly, the input signal of the node d is:
Figure 766229DEST_PATH_IMAGE013
since the terminal is good at doing tasks with loops, it can be represented by matrix multiplication:
Figure 913177DEST_PATH_IMAGE014
therefore, the output of the hidden layer node after nonlinear transformation is represented as follows:
Figure 285383DEST_PATH_IMAGE015
similarly, the input signal of the output layer is represented as the weight matrix multiplied by the output of the above layer:
Figure 201387DEST_PATH_IMAGE016
similarly, the final output of the output layer node after nonlinear mapping is represented as:
Figure 125481DEST_PATH_IMAGE017
the input signal gets the output of each layer with the help of the weight matrix, and finally reaches the output layer. Therefore, the weight matrix plays a role of a transportation soldier in the process of forward signal propagation and plays a role of starting and starting.
Referring to fig. 3, fig. 3 is a schematic diagram of a network layer of the convolutional neural network model. For back-propagation, gradient descent requires an explicit error at each layer to update the parameters, so the next focus is how to propagate the error of the output layer back to the hidden layer.
The errors of the output-layer and hidden-layer nodes are shown in the figure. The output-layer errors δ_e and δ_f are known; error analysis is then carried out for the first hidden node c. Standing on node c again, but this time looking forward (toward the output layer), the two arrows leaving node c point to nodes e and f, so the error of node c is necessarily related to output nodes e and f. Output node e receives input from both hidden nodes c and d, so its error cannot be attributed to node c alone; it is apportioned according to the weights, and the error of node f obeys the same principle. The error of hidden node c is therefore:

δ_c = (w_ec / (w_ec + w_ed)) · δ_e + (w_fc / (w_fc + w_fd)) · δ_f

where the two weight ratios are the back-propagation coefficients of the output layer. Similarly, the error of hidden node d is:

δ_d = (w_ed / (w_ec + w_ed)) · δ_e + (w_fd / (w_fc + w_fd)) · δ_f

with the corresponding hidden-layer back-propagation coefficients. To reduce the workload, this can be written as a matrix multiplication:

[δ_c]   [w_ec/(w_ec+w_ed)  w_fc/(w_fc+w_fd)]   [δ_e]
[δ_d] = [w_ed/(w_ec+w_ed)  w_fd/(w_fc+w_fd)] · [δ_f]

This matrix is rather cumbersome. It can be simplified towards the forward-propagation form without destroying its proportions, so the denominators are omitted, giving:

[δ_c]   [w_ec  w_fc]   [δ_e]
[δ_d] = [w_ed  w_fd] · [δ_f]

This matrix is exactly the transpose of the weight matrix W used in forward propagation, so the abbreviated form is:

δ^(l-1) = (W^l)^T · δ^l

The output-layer error is passed to the hidden layer with the help of the transposed weight matrix, and the weight matrix connected to the hidden layer can then be updated with this indirect error. The weight matrix thus also acts as a courier in back-propagation, but this time it transports the output error rather than the input signal.
Referring to fig. 4, fig. 4 is a schematic diagram of a network layer of the convolutional neural network model. Next, chain-rule derivation is performed: with the forward propagation of the input information and the backward propagation of the output error introduced above, the parameters are updated according to the obtained errors.
First, the hidden-layer weight w_11 (connecting hidden node c to output node e) is updated. Before updating, the derivation proceeds from back to front until w_11 is reached:

E = (1/2) · (y - a_e)^2
a_e = σ(z_e)
z_e = w_11 · a_c + w_12 · a_d + b_e

The partial derivative of the error with respect to w_11 is therefore obtained by the chain rule:

∂E/∂w_11 = (∂E/∂a_e) · (∂a_e/∂z_e) · (∂z_e/∂w_11)

from which the following evaluation formula is derived (all quantities are known):

∂E/∂w_11 = -(y - a_e) · σ'(z_e) · a_c

Similarly, the partial derivative of the error with respect to w_12 is:

∂E/∂w_12 = (∂E/∂a_e) · (∂a_e/∂z_e) · (∂z_e/∂w_12)

and its evaluation formula is:

∂E/∂w_12 = -(y - a_e) · σ'(z_e) · a_d

Similarly, the partial derivative of the error with respect to the bias is:

∂E/∂b_e = (∂E/∂a_e) · (∂a_e/∂z_e) · (∂z_e/∂b_e) = -(y - a_e) · σ'(z_e)

Next, the input-layer weight w_11 (connecting input node a to hidden node c) is updated. Again the derivation proceeds from back to front until the first-layer w_11 is reached:

a_c = σ(z_c)
z_c = w_11 · a + w_12 · b + b_c

so the partial derivative of the error with respect to the input-layer w_11 is:

∂E/∂w_11 = (∂E/∂a_c) · (∂a_c/∂z_c) · (∂z_c/∂w_11)

where ∂E/∂a_c collects the errors propagated back from both output nodes e and f.
Similarly, the partial derivatives of the other three input-layer parameters can be calculated by the same method, which is not repeated here. With the partial derivative of each parameter determined, each parameter is substituted into the gradient descent formula:

w' = w - η · ∂E/∂w

So far, the task of updating the parameters of each layer by the chain rule has been completed.
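The whole derivation can be checked numerically with a self-contained sketch of the 2-2-2 network from the figures (numpy is assumed and all numeric values are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    W2, b2 = rng.normal(size=(2, 2)), np.zeros((2, 1))   # input -> hidden
    W3, b3 = rng.normal(size=(2, 2)), np.zeros((2, 1))   # hidden -> output
    x = np.array([[0.5], [0.1]])                         # input nodes a, b
    y = np.array([[1.0], [0.0]])                         # standard answer Y

    # Forward propagation: z = W a + b, a = sigmoid(z) at each layer.
    z2 = W2 @ x + b2;  a2 = sigmoid(z2)                  # hidden nodes c, d
    z3 = W3 @ a2 + b3; a3 = sigmoid(z3)                  # output nodes e, f

    # Backward propagation: delta_hidden = W3^T delta_out, as derived above.
    delta3 = (a3 - y) * a3 * (1 - a3)                    # dE/dz3 for E = 1/2 (y - a3)^2
    delta2 = (W3.T @ delta3) * a2 * (1 - a2)             # via the transposed weights

    # Chain-rule gradients and one gradient-descent step with learning rate eta.
    eta = 0.1
    W3 -= eta * (delta3 @ a2.T); b3 -= eta * delta3
    W2 -= eta * (delta2 @ x.T);  b2 -= eta * delta2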
After the weights of the convolutional neural network model are updated, a copy of the convolutional neural network model is kept, so that if problems such as poor generalization or overfitting occur later in training, the originally saved convolutional neural network model can be restored.
After the convolutional neural network model is updated, the original sample can be selected to be input into the convolutional neural network model again for training, or new original samples are synthesized again and input into the convolutional neural network model for training.
In this embodiment, a convolutional neural network model and a defect type label are first obtained, where the convolutional neural network model includes a generator, a true-false discriminator and a category discriminator, and the defect type label is a defect label of a target generated image. Inputting a group of normal distribution sampling data and defect type labels into a generator to generate a simulated defect image, namely sampling the normal distribution data, and inputting the sampled data into the generator in combination with the defect type labels, so that the generator generates the simulated defect image according to the type labels and the normal distribution sampling data. Selecting a real defect image, wherein the real defect image and the defect type label correspond to the same defect type, inputting the real defect image and the simulated defect image into a true and false discriminator, and generating a first true and false discrimination value of the real defect image and a second true and false discrimination value of the simulated defect image. And inputting the real defect image and the simulated defect image into a category discriminator to generate a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image. And calculating a first loss value of the generator, a second loss value of the true-false discriminator and a third loss value of the category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value and the second category discrimination value. And judging whether the first loss value, the second loss value and the third loss value meet preset conditions. And if the conditions are met, determining that the training of the convolutional neural network model is finished. If the condition is not met, updating the weight of the generator according to the first loss value, the second loss value and the third loss value, and respectively updating the weight of the true-false discriminator and the weight of the category discriminator according to the second loss value and the third loss value. In this embodiment, the generator generates a simulated defect image according to the type label and the normal distribution sampling data, and then performs true-false discrimination and type discrimination on the corresponding real defect image and simulated defect image, respectively, to reversely update the generator with the obtained generator loss value (first loss value), the true-false discriminator loss value (second loss value), and the category discriminator loss value (third loss value), and the true-false discriminator and the category discriminator use the true-false discriminator loss value and the category discriminator loss value to update, respectively. After training is finished, the generator can generate an image only by using the normal distribution sampling data and the defect type label, and the quality of the generated image can simultaneously reach the discrimination standards of a true and false discriminator and a type discriminator, so that the defect image acquisition efficiency and the image quality are improved to a great extent.
Referring to fig. 5-1, 5-2, 5-3, 5-4, 5-5, 5-6, 5-7, and 5-8, the present application provides an embodiment of a training method for a generative model of a defect data set, comprising:
501. acquiring a convolutional neural network model and a defect type label, wherein the convolutional neural network model comprises a generator, a true and false discriminator and a category discriminator;
step 501 in this embodiment is similar to step 101 in the previous embodiment, and is not described again here.
502. Acquiring defect label characteristics of a real defect image through an encoder;
503. Calculating a mean set and a variance set, namely the hidden space parameters, from the defect label features, wherein the hidden space parameters describe the conditional probability distribution of the real defect image;
504. sampling the mean set and the variance set by a re-parameterization technology to generate normal distribution sampling data, wherein the normal distribution sampling data follow the conditional probability distribution of a real defect image;
the terminal extracts the defect label features of the original image through the encoder and inputs them into the hidden space, which generates hidden space parameters for the defect label features; the hidden space parameters describe the conditional probability distribution of the defect image and, in this embodiment, consist of a mean set and a variance set. A normal distribution built from the mean set and the variance set is needed subsequently, but sampling from that distribution directly is not differentiable, so gradients cannot propagate through it.
Because direct sampling from the normal distribution defined by the mean set and the variance set is not differentiable, the hidden space parameters are instead sampled by the re-parameterization technique, yielding sampling data that conform to the normal distribution. Since these data are sampled on the basis of the original hidden space parameters, they also carry the type label corresponding to the defect and therefore follow the conditional probability distribution of the defect image.
The normal distribution sampling data in the embodiment are from real defect images, so that the normal distribution sampling data can be better combined with defect type labels to generate simulated defect images, the training efficiency is improved, and the training time is shortened.
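A minimal sketch of the re-parameterization step (PyTorch is assumed, and the encoder is assumed to output log-variance rather than variance):

    import torch

    def reparameterize(mu, logvar):
        # Sampling z ~ N(mu, sigma^2) directly is not differentiable; writing
        # z = mu + sigma * eps with eps ~ N(0, I) keeps gradients flowing back
        # to the encoder that produced the mean set and variance set.
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps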
505. Carrying out convolution operation on the normal distribution sampling data through a first convolution module to generate a first sampling characteristic;
506. performing convolution operation on the defect type label through a second convolution module to generate a first label characteristic;
the terminal firstly performs convolution operation on the normal distribution sampling data and the defect type label, specifically performs convolution operation on the normal distribution sampling data through a first convolution module to generate a first sampling characteristic, and performs convolution operation on the defect type label through a second convolution module to generate a first label characteristic.
507. Performing regional pixel value weight generation processing on the first sampling feature through a regional pixel attention module (RPA) to generate a first intermediate feature;
508. correspondingly multiplying the first sampling characteristic and the first intermediate characteristic according to channels through a regional pixel attention module (RPA) to generate a second intermediate characteristic;
the block RPA of local pixel attention of this step includes a Batchnorm-DefConv-ReLU, a Batchnorm-DefConv, a SigMoid function block, and a bilinear interpolation block. The BatchNorm-DefConv-ReLU, the BatchNorm-DefConv, the SigMoid function module and the bilinear interpolation module are connected in series in sequence. The BatchNorm-DefConv-ReLU layer and the BatchNorm-DefConv layer both belong to a feature processing layer commonly used in a convolutional neural network, a SigMoid function is a known function, and a bilinear interpolation operation method is also a known algorithm.
The region pixel attention module RPA is used as a first re-attention mechanism, and since a weight is assigned to each region pixel of the first sampling feature, the neural network pays more attention to a region where the first sampling feature is obvious.
Specifically, assuming that the number of input original images is B, the number of channels is C, and the resolution is W × H, the first sampling feature is (B, C, H, W), (B, C, H, W) that needs to pass through the BatchNorm-DefConv-ReLU layer of the local pixel attention module RPA to perform channel compression to (B, C × r, H/2, W/2), where r is <1. And then, reducing the image into (B, C, H/4, W/4) through a BatchNorm-DefConv layer, generating the weight of each pixel value through a SigMoid function module, and finally, reducing the image into new (B, C, H, W) through bilinear interpolation, and multiplying the new (B, C, H, W) of the original image one by one.
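An assumption-laden PyTorch reading of the RPA module (ordinary strided convolutions stand in for the deformable convolutions, and the class name is hypothetical):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RegionPixelAttention(nn.Module):
        # Sketch of the RPA module described above.
        def __init__(self, c, r=0.5):
            super().__init__()
            cr = max(1, int(c * r))
            self.bn_conv_relu = nn.Sequential(       # BatchNorm-DefConv-ReLU
                nn.BatchNorm2d(c), nn.Conv2d(c, cr, 3, stride=2, padding=1), nn.ReLU())
            self.bn_conv = nn.Sequential(            # BatchNorm-DefConv
                nn.BatchNorm2d(cr), nn.Conv2d(cr, c, 3, stride=2, padding=1))

        def forward(self, x):                        # x: (B, C, H, W)
            w = self.bn_conv_relu(x)                 # (B, C*r, H/2, W/2)
            w = torch.sigmoid(self.bn_conv(w))       # (B, C, H/4, W/4) pixel weights
            w = F.interpolate(w, size=x.shape[2:], mode="bilinear", align_corners=False)
            return x * w                             # element-wise re-weighting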
509. Performing convolution processing and channel superposition processing on the second intermediate features through a third convolution module;
and the terminal performs convolution processing and channel superposition processing on the second intermediate characteristic through a third convolution module, specifically, the second intermediate characteristic is input into the third convolution module to generate convolution data, and the data and the second intermediate characteristic are subjected to channel superposition.
510. Generating a channel vector for the second intermediate feature through the Attention module Attention of the first channel;
511. outputting a normalized one-dimensional vector with the same dimension as the second intermediate characteristic channel number by combining the Attention of the first channel module with the channel vector;
512. correspondingly multiplying the second intermediate features according to the channels through a first channel Attention module Attention according to the normalized one-dimensional vector to generate third intermediate features;
the Attention mechanism of the channel Attention module Attention is mainly to distribute normalized weights to different feature channels, enhance some channels and suppress other channels, so as to achieve the effect of selecting feature information (defect features).
The channel Attention module Attention includes a global average pooling layer, a 1 × 1 Conv-ReLU, and a 1 × 1 Conv-Sigmoid; its operation principle is described in detail below.
Specifically, the second intermediate feature first passes through the global average pooling layer of the first channel Attention module Attention to generate a channel vector. The channel vector then passes through a 1 × 1 convolution kernel with a ReLU activation function for channel compression, and through another 1 × 1 convolution kernel with a Sigmoid activation function, outputting a normalized one-dimensional vector whose dimension equals the number of input feature channels, that is, the attention weight of each feature channel. Each channel of the input feature is multiplied by its corresponding weight to generate the third intermediate feature.
513. Performing convolution processing and channel superposition processing on the third intermediate features through a fourth convolution module;
the terminal performs convolution processing and channel superposition processing on the third intermediate feature through the fourth convolution module, and the detailed steps are similar to step 509 and are not described herein again.
514. Generating a channel vector for the third intermediate feature by a second channel Attention module Attention;
515. outputting a normalized one-dimensional vector with the same dimension as the third intermediate characteristic channel number by combining the Attention module Attention of the second channel with the channel vector;
516. correspondingly multiplying the third intermediate features according to the channels by a second channel Attention module Attention according to the normalized one-dimensional vector to generate fourth intermediate features;
steps 514 to 516 are similar to steps 510 to 512, and are not described herein. It should be noted that after the second channel Attention module Attention, the convolution module + channel Attention module Attention may be added to make the output characteristics more effective.
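A sketch of this SE-style channel attention, as used in steps 510 to 516 (the reduction ratio r is an assumption):

    import torch.nn as nn

    class ChannelAttention(nn.Module):
        # Sketch of the channel Attention module described above.
        def __init__(self, c, r=4):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)            # global average pooling
            self.fc = nn.Sequential(
                nn.Conv2d(c, c // r, 1), nn.ReLU(),        # 1x1 Conv-ReLU compression
                nn.Conv2d(c // r, c, 1), nn.Sigmoid())     # 1x1 Conv-Sigmoid weights

        def forward(self, x):                              # x: (B, C, H, W)
            w = self.fc(self.pool(x))                      # (B, C, 1, 1), values in (0, 1)
            return x * w                                   # weight each channel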
517. Channel superposition is carried out on the fourth intermediate feature, the first label feature and the first sampling feature, and a fifth intermediate feature is generated;
and the terminal performs channel superposition on the fourth intermediate feature, the first label feature and the first sampling feature to generate a fifth intermediate feature, so that the features can be fused to the label feature after attention distribution.
518. Performing feature length and width reconstruction on the fifth intermediate feature through a first deconvolution module to generate a sixth intermediate feature;
and the terminal reconstructs the fifth intermediate feature, which already fuses the first label feature, using the first deconvolution module, increasing the feature's length and width to generate the sixth intermediate feature.
519. Distributing attention to each neuron corresponding to the sixth intermediate feature through a first attention Dropout module ADO, and setting the neurons whose attention is smaller than a first preset threshold to zero to generate a seventh intermediate feature;
in this embodiment, the attention Dropout module ADO includes a BatchNorm-2 × 2DefConv-ReLU block and a BatchNorm-2 × 2DefConv-SigMoid block.
This attention-based Dropout differs from the random scheme used by ordinary Dropout: attention is used to retain the more important feature information, giving the convolutional neural network model better performance and generalization.
The input sixth intermediate feature is processed by BatchNorm-2 × 2DefConv-ReLU, whose output is fed into BatchNorm-2 × 2DefConv-SigMoid to generate an attention matrix of the same size as the original feature. According to the values of the attention matrix, the neurons at the positions of the original feature matrix whose attention is smaller than the first preset threshold are set to zero, and the seventh intermediate feature is output.
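A sketch of the ADO module under this description (plain 2 × 2 convolutions stand in for the deformable ones, and the threshold value is an assumption):

    import torch
    import torch.nn as nn

    class AttentionDropout(nn.Module):
        # Sketch of the attention Dropout module ADO.
        def __init__(self, c, threshold=0.1):
            super().__init__()
            self.branch = nn.Sequential(
                nn.BatchNorm2d(c), nn.Conv2d(c, c, 2, padding=1), nn.ReLU(),  # BN-2x2Conv-ReLU
                nn.BatchNorm2d(c), nn.Conv2d(c, c, 2))                        # BN-2x2Conv
            self.threshold = threshold

        def forward(self, x):
            a = torch.sigmoid(self.branch(x))            # attention matrix, same size as x
            return x * (a >= self.threshold).float()     # zero low-attention neurons

Unlike standard Dropout, which zeroes neurons at random, the mask here is driven by the learned attention, so the more important feature information survives.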
520. Performing characteristic length and width reconstruction on the seventh intermediate characteristic through a second deconvolution module to generate a first generation parameter;
and the terminal performs feature length and width reconstruction on the seventh intermediate feature through the second deconvolution module to generate a first generation parameter, which is similar to step 518 and is not described herein again.
521. Channel superposition is carried out on the first generation parameters and the first sampling characteristics to generate eighth intermediate parameters;
and when the terminal inputs the first generation parameter into the second generation module Generator, the terminal performs channel superposition on the first generation parameter and the first sampling characteristic to generate an eighth intermediate parameter.
522. Performing regional pixel value weight generation processing on the eighth intermediate parameter through a regional pixel attention module (RPA) to generate a ninth intermediate feature;
523. correspondingly multiplying the eighth intermediate parameter and the ninth intermediate feature by a regional pixel attention module RPA according to channels to generate a tenth intermediate feature;
524. performing convolution processing and channel superposition processing on the tenth intermediate feature through a fifth convolution module;
525. generating a channel vector for the tenth intermediate feature through a third channel Attention module Attention;
526. outputting a normalized one-dimensional vector with the same dimension as the tenth middle characteristic channel number by combining the third channel Attention module Attention with the channel vector;
527. correspondingly multiplying the tenth intermediate feature by the third channel Attention module Attention according to the normalized one-dimensional vector to generate an eleventh intermediate feature;
528. performing convolution processing and channel superposition processing on the eleventh intermediate feature through a sixth convolution module;
529. generating a channel vector for the eleventh intermediate feature by a fourth channel Attention module Attention;
530. outputting a normalized one-dimensional vector with the same dimensionality as the eleventh intermediate characteristic channel number by combining the fourth channel Attention module Attention with the channel vector;
531. correspondingly multiplying the eleventh intermediate feature by the fourth channel Attention module Attention according to the normalized one-dimensional vector to generate a twelfth intermediate feature;
532. channel superposition is carried out on the twelfth intermediate feature, the first label feature and the eighth intermediate parameter to generate a thirteenth intermediate feature;
533. performing feature length and width reconstruction on the thirteenth intermediate feature through a third deconvolution module to generate a fourteenth intermediate feature;
534. distributing attention to each neuron corresponding to the fourteenth intermediate feature through a second attention Dropout module ADO, and setting the neuron with the attention smaller than a second preset threshold to zero to generate a fifteenth intermediate feature;
535. performing feature length and width reconstruction on the fifteenth intermediate feature through a fourth deconvolution module to generate a second generation parameter;
steps 522 to 535 are similar to steps 507 to 520 and are not repeated here.
536. Inputting the N-1 generation parameter, the first sampling characteristic and the first tag characteristic into an Nth generation module Generator to generate an Nth generation parameter;
step 536 is similar to steps 522 to 535 or 507 to 520, and is not described herein.
537. Restoring and outputting the Nth generation parameter through the output module Conv_out to generate a simulated defect image;
the terminal restores the Nth generation parameter to an image through the convolutional output module Conv_out to generate the simulated defect image; specifically, a 3 × 3 convolution restores the Nth generation parameter to a 3-channel image of the original size.
538. Performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module (RPA) to generate a first discriminant feature;
539. correspondingly multiplying the real defect image and the first discrimination feature by a regional pixel attention module (RPA) according to a channel to generate a second discrimination feature;
steps 538 to 539 are similar to steps 507 to 508, and are not described herein.
540. Performing channel superposition on the second discrimination feature and the first label feature to generate a third discrimination feature;
541. distributing attention to different regions of the third distinguishing characteristic through convolution kernels of different-size receptive fields in a region channel attention module SKConv, and screening different characteristic channels of the third distinguishing characteristic through the distributed attention to generate a fourth distinguishing characteristic;
and the terminal performs channel superposition on the second discriminant feature and the first label feature to generate the third discriminant feature, distributes attention to different regions of the third discriminant feature through convolution kernels with receptive fields of different sizes in the region channel attention module SKConv, and screens the different feature channels of the third discriminant feature through the distributed attention to generate the fourth discriminant feature.
Specifically, the region channel attention module SKConv handles both the convolution-kernel receptive-field attention and the feature-channel attention of the features: attention is distributed to regions of different sizes of the feature through convolution kernels with different receptive fields, and different channels are screened through channel attention, further improving the encoding of the input feature (the third discriminant feature) by the convolutional neural network. A Resnet structure is added to enhance the circulation of features between front and rear layers and to prevent vanishing and exploding gradients.
The region channel attention module SKConv comprises at least two deformable convolution kernels with different receptive fields, a first feature superposition module, a feature global average pooling module, a channel restoration module, at least two Softmax modules, and a second feature superposition module.
For example, deformable convolution kernels with 3 different receptive fields each extract features from the third discriminant feature, giving 3 features (the receptive-field feature set). The first feature superposition module stacks the receptive-field feature set along the channel dimension into a feature (B, 3 × C, H, W). The feature global average pooling module compresses and globally average-pools this feature into (B, 3C', 1, 1); the channel restoration module restores it to an intermediate feature (B, 3C, 1, 1); and the Softmax module distributes attention over the whole channel dimension of the intermediate feature, dividing it into three parts, one per deformable convolution kernel. Each part of the channel attention is multiplied with the corresponding channels of the receptive-field feature set output by its deformable convolution kernel, the results are added element-wise by the second feature superposition module, and the sum is superimposed by channel with the second feature.
Deformable convolution adds an extra offset (direction) parameter to each element of the convolution kernel, so that during training the kernel can spread over a larger range. Traditional convolution adapts poorly to unknown variations and generalizes weakly; deformable convolution replaces its fixed rectangular convolution window and better fits irregularly shaped features such as display screen appearance defects, so that the convolution can concentrate on the pixels where the defect is located.
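A sketch of the SKConv idea with two branches (ordinary 3 × 3 and 5 × 5 convolutions stand in for the deformable kernels of different receptive fields; the channel widths and the class name are assumptions):

    import torch
    import torch.nn as nn

    class SKConvSketch(nn.Module):
        # Sketch of the region channel attention module SKConv described above.
        def __init__(self, c, r=4):
            super().__init__()
            self.branch3 = nn.Conv2d(c, c, 3, padding=1)       # small receptive field
            self.branch5 = nn.Conv2d(c, c, 5, padding=2)       # large receptive field
            self.pool = nn.AdaptiveAvgPool2d(1)                # global average pooling
            self.squeeze = nn.Sequential(nn.Conv2d(2 * c, 2 * c // r, 1), nn.ReLU())
            self.restore = nn.Conv2d(2 * c // r, 2 * c, 1)     # channel restoration

        def forward(self, x):                                  # x: (B, C, H, W)
            u3, u5 = self.branch3(x), self.branch5(x)          # receptive-field feature set
            s = torch.cat([u3, u5], dim=1)                     # channel superposition
            a = self.restore(self.squeeze(self.pool(s)))       # (B, 2C, 1, 1)
            a3, a5 = torch.softmax(a.view(x.size(0), 2, -1, 1, 1), dim=1).unbind(1)
            out = u3 * a3 + u5 * a5                            # attention-weighted fusion
            return out + x                                     # Resnet-style shortcut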
542. Distributing attention to each neuron corresponding to the fourth distinguishing feature through a third attention Dropout module ADO, and setting the neuron with the attention smaller than a third preset threshold to zero to generate a fifth distinguishing feature;
in this embodiment, step 542 is similar to step 519, and is not described herein again.
543. Performing channel shuffling on the fifth distinguishing characteristic through a channel shuffling module CSA;
and the terminal performs channel shuffling on the fifth discriminant feature through the channel shuffling module CSA; specifically, the channels of the fifth discriminant feature are shuffled and the original fifth discriminant feature is superimposed on the result, enhancing inter-channel feature fusion while retaining the original feature information.
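A sketch of the CSA operation (the group count is an assumption):

    def channel_shuffle_csa(x, groups=4):
        # Shuffle channels group-wise, then superimpose the original feature
        # so inter-channel fusion is enhanced and the original information kept.
        b, c, h, w = x.shape
        shuffled = x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)
        return shuffled + x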
544. Distributing attention to different regions of the fifth distinguishing feature through convolution kernels of different receptive fields in the region channel attention module SKConv, and screening different feature channels of the fifth distinguishing feature through the distributed attention to generate a sixth distinguishing feature;
545. distributing attention to each neuron corresponding to the sixth distinguishing feature through a fourth attention Dropout module ADO, and setting the neuron with the attention smaller than a first preset threshold to zero to generate a seventh distinguishing feature;
in this embodiment, steps 544 to 545 are similar to steps 541 and 542 described above, and are not described herein again.
546. Distributing attention to each channel of the seventh distinguishing features through an attention channel pooling module ACD, abandoning the channels with the later attention ranking, and generating eighth distinguishing features;
the attention channel pooling module ACD includes a global average pooling layer, a 1 × 1 Conv + ReLU block, and a 1 × 1 Conv + SigMoid block.
The seventh discriminant feature passes through the global average pooling, 1 × 1 Conv + ReLU, and 1 × 1 Conv + SigMoid blocks to generate the attention of each channel; the feature channels are then sorted by attention, and the channels whose attention ranks last are discarded, generating the eighth discriminant feature.
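A sketch of the ACD idea (the keep ratio is an assumption; low-attention channels are zeroed here for simplicity, whereas the text above discards them outright):

    import torch
    import torch.nn as nn

    class AttentionChannelPooling(nn.Module):
        # Sketch of the attention channel pooling module ACD.
        def __init__(self, c, keep=0.75, r=4):
            super().__init__()
            self.keep = max(1, int(c * keep))
            self.attn = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(c, c // r, 1), nn.ReLU(),
                nn.Conv2d(c // r, c, 1), nn.Sigmoid())

        def forward(self, x):                                   # x: (B, C, H, W)
            a = self.attn(x).squeeze(-1).squeeze(-1)            # (B, C) channel attention
            idx = a.argsort(dim=1, descending=True)[:, :self.keep]
            mask = torch.zeros_like(a).scatter_(1, idx, 1.0)    # 1 for kept channels
            return x * (a * mask).unsqueeze(-1).unsqueeze(-1)   # drop the rest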
547. Extracting feature information of the eighth distinguishing feature through a feature compression module FS to generate true and false distinguishing data;
the Feature compression module is also called Feature Squeeze module, which extracts Feature information from the eighth distinguishing Feature output by the previous layer by convolution and compresses the length and width to generate true and false distinguishing data.
548. Inputting the true and false distinguishing data into the next true and false distinguishing module in the true and false discriminator until the SigMoid function module analyzes the true and false distinguishing data output by the last true and false distinguishing module to obtain a first true and false distinguishing value;
the true and false discriminator comprises at least two true and false discriminating modules, wherein the output of one true and false discriminating module is input into the next true and false discriminating module until the last true and false discriminating module finishes processing the characteristics.
In this embodiment, the terminal inputs the true and false discrimination data into the next true and false discrimination module in the true and false discriminator until the SigMoid function module analyzes the true and false discrimination data output by the last true and false discrimination module to obtain the first true and false discrimination value.
549. Inputting the simulated defect image into a true and false discriminator, and outputting a second true and false discrimination value through a true and false discrimination module and a SigMoid function module;
and the terminal inputs the simulated defect image into a true and false discriminator and outputs a second true and false discrimination value through a true and false discrimination module and a SigMoid function module.
550. Performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module (RPA) to generate a ninth discriminant feature;
551. correspondingly multiplying the real defect image and the ninth discriminant feature by a regional pixel attention module (RPA) according to channels to generate a tenth discriminant feature;
steps 550 to 551 are similar to steps 507 and 508, and are not described herein again.
552. Performing channel superposition on the tenth distinguishing feature and the first label feature to generate an eleventh distinguishing feature;
and the terminal performs channel superposition on the tenth distinguishing feature and the first label feature, performs feature fusion and generates an eleventh distinguishing feature.
553. Distributing attention to different size regions of the eleventh distinguishing feature through convolution kernels of different size receptive fields in the region channel attention module SKConv, and screening different feature channels of the eleventh distinguishing feature through the distributed attention to generate a twelfth distinguishing feature;
step 553 is similar to step 541 and will not be described herein.
554. Performing channel shuffling on the twelfth distinguishing feature through a channel shuffling module CSA;
step 554 is similar to step 543, and is not described herein.
555. Distributing attention to areas with different sizes of the twelfth distinguishing feature through convolution kernels of different-size receptive fields in the area channel attention module SKConv, and screening different feature channels of the twelfth distinguishing feature through the distributed attention to generate a thirteenth distinguishing feature;
step 555 is similar to step 541 and will not be described herein.
556. Distributing attention to each channel of the thirteenth distinguishing feature through an attention channel pooling module ACD, discarding the channel with the later attention ranking, and generating a fourteenth distinguishing feature;
step 556 is similar to step 546, and will not be described herein.
557. Distributing attention to areas with different sizes of the fourteenth distinguishing feature through convolution kernels of different-size receptive fields in the area channel attention module SKConv, and screening different feature channels of the fourteenth distinguishing feature through the distributed attention to generate a fifteenth distinguishing feature;
step 557 is similar to step 541 and will not be described herein.
558. Extracting feature information of the fifteenth distinguishing feature through a feature compression module FS to generate category distinguishing data;
step 558 is similar to step 547 and is not described herein.
559. Inputting the category discrimination data into a next category discrimination module in the category discriminator until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain a first category discrimination value;
the terminal inputs the category discrimination data into a next category discrimination module in the category discriminator until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain a first category discrimination value, namely, the probability that the real defect image belongs to the defect type A is calculated through the softmax function module.
560. Inputting the simulated defect image into a category discriminator, and outputting a second category discrimination value through a category discrimination module and a softmax function module;
and the terminal inputs the simulated defect image into a category discriminator and outputs a second category discrimination value through a category discrimination module and a softmax function module. The method is similar to the real defect image, and is not described herein.
561. Calculating a first loss value of a generator, a second loss value of a true-false discriminator and a third loss value of a category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value and the second category discrimination value;
562. judging whether the first loss value, the second loss value and the third loss value meet preset conditions or not;
563. if the conditions are met, determining that the training of the convolutional neural network model is finished;
564. if the condition is not met, updating the weight of the generator according to the first loss value, the second loss value and the third loss value, and respectively updating the weight of the true-false discriminator and the weight of the category discriminator according to the second loss value and the third loss value.
In this embodiment, steps 561 to 564 are similar to steps 105 to 108, and are not described herein.
In this embodiment, a convolutional neural network model and a defect type label are first obtained, where the convolutional neural network model includes a generator, a true-false discriminator, and a category discriminator, and the defect type label is the defect label of the target generated image. The terminal obtains the defect label features of the real defect image through the encoder, calculates a mean set and a variance set from the defect label features, and samples the mean set and the variance set through the re-parameterization technique to generate normal distribution sampling data that follow the conditional probability distribution of the real defect image.
Carrying out convolution operation on the normal distribution sampling data through a first convolution module to generate a first sampling characteristic; performing convolution operation on the defect type label through a second convolution module to generate a first label characteristic; performing regional pixel value weight generation processing on the first sampling feature through a regional pixel attention module (RPA) to generate a first intermediate feature; multiplying the first sampling characteristic and the first intermediate characteristic correspondingly according to channels through a regional pixel attention module (RPA) to generate a second intermediate characteristic; performing convolution processing and channel superposition processing on the second intermediate features through a third convolution module; generating a channel vector for the second intermediate feature through the Attention module Attention of the first channel; outputting a normalized one-dimensional vector with the same dimension as the second intermediate characteristic channel number by combining the Attention of the first channel module with the channel vector; correspondingly multiplying the second intermediate features according to the channels through a first channel Attention module Attention according to the normalized one-dimensional vector to generate third intermediate features; performing convolution processing and channel superposition processing on the third intermediate features through a fourth convolution module; generating a channel vector for the third intermediate feature by a second channel Attention module Attention; outputting a normalized one-dimensional vector with the same dimension as the third intermediate characteristic channel number by combining the Attention module Attention of the second channel with the channel vector; correspondingly multiplying the third intermediate features according to the channels by a second channel Attention module Attention according to the normalized one-dimensional vector to generate fourth intermediate features; channel superposition is carried out on the fourth intermediate feature, the first label feature and the first sampling feature, and a fifth intermediate feature is generated; performing feature length and width reconstruction on the fifth intermediate feature through a first deconvolution module to generate a sixth intermediate feature; distributing attention to each neuron corresponding to the sixth intermediate feature through a first attention Dropout module ADO, and setting the neuron with the attention smaller than a first preset threshold to zero to generate a seventh intermediate feature; performing characteristic length and width reconstruction on the seventh intermediate characteristic through a second deconvolution module to generate a first generation parameter; channel superposition is carried out on the first generation parameters and the first sampling characteristics to generate eighth intermediate parameters; performing regional pixel value weight generation processing on the eighth intermediate parameter through a regional pixel attention module (RPA) to generate a ninth intermediate feature; correspondingly multiplying the eighth intermediate parameter and the ninth intermediate feature by a regional pixel attention module RPA according to channels to generate a tenth intermediate feature; performing convolution processing and channel superposition processing on the tenth intermediate feature through a fifth 
convolution module; generating a channel vector for the tenth intermediate feature through a third channel Attention module Attention; outputting a normalized one-dimensional vector with the same dimensionality as the tenth middle characteristic channel number by combining the Attention module Attention of the third channel with the channel vector; correspondingly multiplying the tenth intermediate feature by the third channel Attention module Attention according to the normalized one-dimensional vector to generate an eleventh intermediate feature; performing convolution processing and channel superposition processing on the eleventh intermediate feature through a sixth convolution module; generating a channel vector for the eleventh intermediate feature through a fourth channel Attention module Attention; outputting a normalized one-dimensional vector with the same dimensionality as the eleventh intermediate characteristic channel number by combining the Attention module Attention with the channel vector; correspondingly multiplying the eleventh intermediate feature by the fourth channel Attention module Attention according to the normalized one-dimensional vector to generate a twelfth intermediate feature; channel superposition is carried out on the twelfth intermediate feature, the first label feature and the eighth intermediate parameter to generate a thirteenth intermediate feature; performing feature length and width reconstruction on the thirteenth intermediate feature through a third deconvolution module to generate a fourteenth intermediate feature; distributing attention to each neuron corresponding to the fourteenth intermediate feature through a second attention Dropout module ADO, and setting the neuron with the attention smaller than a second preset threshold to zero to generate a fifteenth intermediate feature; performing feature length and width reconstruction on the fifteenth intermediate feature through a fourth deconvolution module to generate a second generation parameter; inputting the N-1 generation parameter, the first sampling characteristic and the first tag characteristic into an Nth generation module Generator to generate an Nth generation parameter; and restoring and outputting the Nth generation parameter through an output module Conv _ out to generate a simulated defect image.
Performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module RPA to generate a first discrimination feature; multiplying the real defect image and the first discrimination feature correspondingly by channel through the regional pixel attention module RPA to generate a second discrimination feature; performing channel superposition on the second discrimination feature and the first label feature to generate a third discrimination feature;
distributing attention to different regions of the third discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the third discrimination feature through the distributed attention to generate a fourth discrimination feature; distributing attention to each neuron of the fourth discrimination feature through a third attention Dropout module ADO, and zeroing the neurons whose attention is smaller than a third preset threshold to generate a fifth discrimination feature; performing channel shuffling on the fifth discrimination feature through a channel shuffling module CSA; distributing attention to different regions of the fifth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the fifth discrimination feature through the distributed attention to generate a sixth discrimination feature; distributing attention to each neuron of the sixth discrimination feature through a fourth attention Dropout module ADO, and zeroing the neurons whose attention is smaller than the first preset threshold to generate a seventh discrimination feature; distributing attention to each channel of the seventh discrimination feature through an attention channel pooling module ACD, and discarding the channels ranked lowest by attention to generate an eighth discrimination feature; extracting the feature information of the eighth discrimination feature through a feature compression module FS to generate true-false discrimination data; inputting the true-false discrimination data into the next true-false discrimination module in the true-false discriminator until the SigMoid function module analyzes the true-false discrimination data output by the last true-false discrimination module to obtain a first true-false discrimination value; and inputting the simulated defect image into the true-false discriminator, and outputting a second true-false discrimination value through the true-false discrimination modules and the SigMoid function module.
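Two attention-based pruning steps recur in the true-false discrimination module: attention Dropout (ADO), which zeroes low-attention neurons, and attention channel pooling (ACD), which discards the lowest-ranked channels. Below is a hedged PyTorch sketch of both; the scoring convolution, the thresholds, and the keep ratio are assumptions, since the text does not specify them.

```python
import torch
import torch.nn as nn

class AttentionDropout(nn.Module):
    """ADO: score every activation, zero those whose attention falls
    below the preset threshold."""
    def __init__(self, channels: int, threshold: float = 0.1):
        super().__init__()
        self.score = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.threshold = threshold

    def forward(self, x):
        attn = self.score(x)                         # per-neuron attention
        return x * (attn >= self.threshold).float()  # zero low-attention neurons

class AttentionChannelPooling(nn.Module):
    """ACD: rank channels by pooled attention and keep only the top share."""
    def __init__(self, channels: int, keep_ratio: float = 0.5):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.keep = max(1, int(channels * keep_ratio))

    def forward(self, x):
        b, c, _, _ = x.shape
        scores = self.pool(x).view(b, c)             # per-channel attention
        idx = scores.topk(self.keep, dim=1).indices  # top-ranked channels
        idx = idx.sort(dim=1).values                 # preserve channel order
        batch = torch.arange(b, device=x.device).unsqueeze(1)
        return x[batch, idx]                         # discard low-ranked channels
```

Note that ACD changes the channel count, so whatever module follows it must be sized for the reduced number of channels.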
Performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module RPA to generate a ninth discrimination feature; multiplying the real defect image and the ninth discrimination feature correspondingly by channel through the regional pixel attention module RPA to generate a tenth discrimination feature; performing channel superposition on the tenth discrimination feature and the first label feature to generate an eleventh discrimination feature; distributing attention to different regions of the eleventh discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the eleventh discrimination feature through the distributed attention to generate a twelfth discrimination feature; performing channel shuffling on the twelfth discrimination feature through a channel shuffling module CSA; distributing attention to regions of different sizes of the twelfth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the twelfth discrimination feature through the distributed attention to generate a thirteenth discrimination feature; distributing attention to each channel of the thirteenth discrimination feature through an attention channel pooling module ACD, and discarding the channels ranked lowest by attention to generate a fourteenth discrimination feature; distributing attention to different regions of the fourteenth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the fourteenth discrimination feature through the distributed attention to generate a fifteenth discrimination feature; extracting the feature information of the fifteenth discrimination feature through a feature compression module FS to generate category discrimination data; inputting the category discrimination data into the next category discrimination module in the category discriminator until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain a first category discrimination value; and inputting the simulated defect image into the category discriminator, and outputting a second category discrimination value through the category discrimination modules and the softmax function module.
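The region channel attention module SKConv used by both discriminators follows the selective-kernel idea: parallel convolutions with different receptive fields, with softmax attention deciding how much each branch contributes per channel (after Li et al., "Selective Kernel Networks"). A simplified sketch follows; the two-branch layout and the reduction ratio are assumptions.

```python
import torch
import torch.nn as nn

class SKConv(nn.Module):
    """Selective-kernel convolution: 3x3 and 5x5 branches fused by
    softmax attention over branches, computed per channel."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2)
        self.pool = nn.AdaptiveAvgPool2d(1)
        mid = max(channels // reduction, 8)
        self.fc = nn.Linear(channels, mid)
        self.select = nn.Linear(mid, channels * 2)  # one weight set per branch

    def forward(self, x):
        b, c, _, _ = x.shape
        u3, u5 = self.branch3(x), self.branch5(x)   # different-size receptive fields
        s = self.pool(u3 + u5).view(b, c)           # fuse branches, then squeeze
        z = torch.relu(self.fc(s))
        attn = self.select(z).view(b, 2, c).softmax(dim=1)  # attention over branches
        return u3 * attn[:, 0].view(b, c, 1, 1) + u5 * attn[:, 1].view(b, c, 1, 1)
```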
A first loss value of the generator, a second loss value of the true-false discriminator and a third loss value of the category discriminator are calculated according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value and the second category discrimination value. Whether the first loss value, the second loss value and the third loss value satisfy the preset conditions is then judged. If the conditions are satisfied, the training of the convolutional neural network model is determined to be finished. If the conditions are not satisfied, the weight of the generator is updated according to the first loss value, the second loss value and the third loss value, and the weights of the true-false discriminator and the category discriminator are updated according to the second loss value and the third loss value, respectively. In this embodiment, the generator generates a simulated defect image from the defect type label and the normal distribution sampling data; true-false discrimination and category discrimination are then performed on the corresponding real defect image and the simulated defect image, and the resulting generator loss value (first loss value), true-false discriminator loss value (second loss value) and category discriminator loss value (third loss value) are back-propagated to update the generator, while the true-false discriminator and the category discriminator are each updated with their own loss values. After training is completed, the generator can generate images using only normal distribution sampling data and a defect type label, and the quality of the generated images simultaneously meets the discrimination standards of both the true-false discriminator and the category discriminator, which maximizes both the acquisition efficiency and the quality of the defect images.
Secondly, because the normal distribution sampling data is derived from real defect images, it combines better with the defect type label to generate simulated defect images, which improves training efficiency and shortens training time.
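The alternating update described above can be summarized in a short sketch. The loss functions, the optimizers, and the label encodings below are illustrative assumptions; the text fixes only which loss values drive which weights. The true-false discriminator is assumed to end in a sigmoid (probabilities), while the category discriminator is assumed to return raw class logits so that the softmax can be folded into the cross-entropy loss.

```python
import torch

def train_step(generator, tf_disc, cls_disc, opt_g, opt_tf, opt_cls,
               real_img, label_map, label_idx, z):
    """One alternating update over a batch: label_map conditions the networks,
    label_idx holds the class indices for cross-entropy."""
    bce = torch.nn.BCELoss()
    ce = torch.nn.CrossEntropyLoss()

    fake_img = generator(z, label_map)

    # Second loss value: true-false discriminator on real vs. simulated images.
    p_real = tf_disc(real_img, label_map)           # first true-false discrimination value
    p_fake = tf_disc(fake_img.detach(), label_map)  # second true-false discrimination value
    loss_tf = bce(p_real, torch.ones_like(p_real)) + bce(p_fake, torch.zeros_like(p_fake))
    opt_tf.zero_grad(); loss_tf.backward(); opt_tf.step()

    # Third loss value: category discriminator on the real defect image.
    logits_real = cls_disc(real_img)                # first category discrimination value
    loss_cls = ce(logits_real, label_idx)
    opt_cls.zero_grad(); loss_cls.backward(); opt_cls.step()

    # First loss value: the generator is pushed toward both discrimination standards.
    p_fake = tf_disc(fake_img, label_map)
    logits_fake = cls_disc(fake_img)                # second category discrimination value
    loss_g = bce(p_fake, torch.ones_like(p_fake)) + ce(logits_fake, label_idx)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    return loss_g.item(), loss_tf.item(), loss_cls.item()
```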
Referring to fig. 6, the present application provides an embodiment of a training apparatus for a generation model of a defect data set, including:
a first obtaining unit 601, configured to obtain a convolutional neural network model and a defect type label, where the convolutional neural network model includes a generator, a true/false discriminator, and a category discriminator;
a first generating unit 602, configured to input a set of normal distribution sampling data and a defect type label into a generator, and generate a simulated defect image;
a second generating unit 603, configured to input the real defect image and the simulated defect image into the true-false discriminator, and generate a first true-false discrimination value of the real defect image and a second true-false discrimination value of the simulated defect image;
a third generating unit 604, configured to input the real defect image and the simulated defect image into the category discriminator, and generate a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image;
a calculating unit 605, configured to calculate a first loss value of the generator, a second loss value of the true-false discriminator, and a third loss value of the category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value, and the second category discrimination value;
a determining unit 606, configured to determine whether the first loss value, the second loss value, and the third loss value satisfy a preset condition;
the determining unit 607 is configured to determine that the training of the convolutional neural network model is completed when the judging unit determines that the first loss value, the second loss value, and the third loss value satisfy the conditions;
an updating unit 608, configured to, when the judging unit determines that the first loss value, the second loss value, and the third loss value do not satisfy the conditions, update the weight of the generator according to the first loss value, the second loss value, and the third loss value, and update the weight of the true-false discriminator and the weight of the category discriminator according to the second loss value and the third loss value, respectively.
Referring to fig. 7, the present application provides another embodiment of a training apparatus for a generation model of a defect data set, including:
a first obtaining unit 701, configured to obtain a convolutional neural network model and a defect type label, where the convolutional neural network model includes a generator, a true-false discriminator, and a category discriminator;
a second obtaining unit 702, configured to obtain, by an encoder, a defect label characteristic of a real defect image;
a fourth generating unit 703, configured to calculate a mean set and a variance set from the defect label features, where these hidden space parameters describe the conditional probability distribution of the real defect image;
a fifth generating unit 704, configured to sample the mean set and the variance set by using a reparameterization technique, and generate normal distribution sampling data, where the normal distribution sampling data follows a conditional probability distribution of a real defect image;
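The reparameterization step performed by unit 704 is the standard trick: sampling from N(mu, sigma^2) is rewritten as mu + sigma * eps with eps ~ N(0, 1), so gradients can flow back through the mean and variance sets to the encoder. A minimal sketch, assuming the variance set is carried as log-variance:

```python
import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    std = torch.exp(0.5 * log_var)  # variance set stored as log-variance (assumption)
    eps = torch.randn_like(std)     # standard normal noise
    return mu + eps * std           # follows the conditional distribution N(mu, sigma^2)
```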
a first generating unit 705, configured to input a set of normal distribution sampling data and a defect type label into the generator, and generate a simulated defect image;
optionally, the generator includes N generation modules Generator and an output module Conv_out, where N is an integer greater than or equal to 2;
the first generation unit 705 includes:
the first generation module 7051 is configured to perform convolution operation on the normal distribution sampling data through the first convolution module to generate a first sampling feature;
a second generating module 7052, configured to perform convolution operation on the defect type label through the second convolution module to generate a first label feature;
a third generating module 7053, configured to input the first sampling feature and the first label feature into the first generation module Generator to generate a first generation parameter.
Optionally, the generation module Generator includes a region pixel Attention module RPA, an Attention Dropout module ADO, at least two channel Attention modules Attention, a deconvolution module, and a convolution module;
the third generating module 7053 is specifically configured to:
performing regional pixel value weight generation processing on the first sampling feature through a regional pixel attention module (RPA) to generate a first intermediate feature;
multiplying the first sampling characteristic and the first intermediate characteristic correspondingly according to channels through a regional pixel attention module (RPA) to generate a second intermediate characteristic;
performing convolution processing and channel superposition processing on the second intermediate features through a third convolution module;
generating a channel vector for the second intermediate feature by a first channel Attention module Attention;
outputting, through the first channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the second intermediate feature;
the second intermediate features are multiplied correspondingly according to the channels through the Attention module Attention of the first channel according to the normalized one-dimensional vector, and third intermediate features are generated;
performing convolution processing and channel superposition processing on the third intermediate features through a fourth convolution module;
generating a channel vector for the third intermediate feature by a second channel Attention module Attention;
outputting, through the second channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the third intermediate feature;
correspondingly multiplying the third intermediate features according to the channels by a second channel Attention module Attention according to the normalized one-dimensional vector to generate fourth intermediate features;
channel superposition is carried out on the fourth intermediate feature, the first label feature and the first sampling feature, and a fifth intermediate feature is generated;
performing feature length and width reconstruction on the fifth intermediate feature through a first deconvolution module to generate a sixth intermediate feature;
distributing attention to each neuron corresponding to the sixth intermediate feature through a first attention Dropout module ADO, and setting the neuron with the attention smaller than a first preset threshold to zero to generate a seventh intermediate feature;
and performing feature length and width reconstruction on the seventh intermediate feature through a second deconvolution module to generate a first generation parameter.
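For concreteness, a "feature length and width reconstruction" step can be realized with a transposed convolution; the channel counts and the kernel/stride/padding combination below, which doubles the spatial size, are assumptions rather than values fixed by the text.

```python
import torch
import torch.nn as nn

# A kernel of 4, stride of 2 and padding of 1 doubles height and width:
# out = (in - 1) * stride - 2 * padding + kernel = 2 * in.
deconv = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)
y = deconv(torch.randn(1, 128, 16, 16))  # -> shape (1, 64, 32, 32)
```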
A fourth generating module 7054, configured to input the first generation parameter, the first sampling feature, and the first label feature into the second generation module Generator to generate a second generation parameter;
optionally, the fourth generating module 7054 is specifically configured to:
channel superposition is carried out on the first generation parameters and the first sampling characteristics to generate eighth intermediate parameters;
performing regional pixel value weight generation processing on the eighth intermediate parameter through a regional pixel attention module (RPA) to generate a ninth intermediate feature;
correspondingly multiplying the eighth intermediate parameter and the ninth intermediate feature by a regional pixel attention module (RPA) according to channels to generate a tenth intermediate feature;
performing convolution processing and channel superposition processing on the tenth intermediate feature through a fifth convolution module;
generating a channel vector for the tenth intermediate feature through a third channel Attention module Attention;
outputting, through the third channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the tenth intermediate feature;
correspondingly multiplying the tenth intermediate feature by the third channel Attention module Attention according to the normalized one-dimensional vector to generate an eleventh intermediate feature;
performing convolution processing and channel superposition processing on the eleventh intermediate feature through a sixth convolution module;
generating a channel vector for the eleventh intermediate feature through a fourth channel Attention module Attention;
outputting, through the fourth channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the eleventh intermediate feature;
correspondingly multiplying the eleventh intermediate feature by the fourth channel Attention module Attention according to the normalized one-dimensional vector to generate a twelfth intermediate feature;
channel superposition is carried out on the twelfth intermediate feature, the first label feature and the eighth intermediate parameter to generate a thirteenth intermediate feature;
performing feature length and width reconstruction on the thirteenth intermediate feature through a third deconvolution module to generate a fourteenth intermediate feature;
distributing attention to each neuron corresponding to the fourteenth intermediate feature through a second attention Dropout module ADO, and setting the neuron with the attention smaller than a second preset threshold to zero to generate a fifteenth intermediate feature;
and performing feature length and width reconstruction on the fifteenth intermediate feature through a fourth deconvolution module to generate a second generation parameter.
A fifth generating module 7055, configured to input the (N-1)th generation parameter, the first sampling feature, and the first label feature into the Nth generation module Generator to generate an Nth generation parameter;
a sixth generating module 7056, configured to restore and output the Nth generation parameter through the output module Conv_out to generate a simulated defect image.
A second generating unit 706, configured to input the real defect image and the simulated defect image into the true-false discriminator, and generate a first true-false discrimination value of the real defect image and a second true-false discrimination value of the simulated defect image;
optionally, the true-false discriminator comprises at least one true-false discriminating module and a SigMoid function module, where one true-false discriminating module includes a region pixel attention module RPA, a region channel attention module SKConv, an attention Dropout module ADO, a channel shuffle module CSA, an attention channel pooling module ACD, and a feature compression module FS;
the second generating unit 706 specifically includes:
performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module RPA to generate a first discrimination feature;
multiplying the real defect image and the first discrimination feature correspondingly by channel through the regional pixel attention module RPA to generate a second discrimination feature;
performing channel superposition on the second discrimination feature and the first label feature to generate a third discrimination feature;
distributing attention to different regions of the third discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the third discrimination feature through the distributed attention to generate a fourth discrimination feature;
distributing attention to each neuron of the fourth discrimination feature through a third attention Dropout module ADO, and zeroing the neurons whose attention is smaller than a third preset threshold to generate a fifth discrimination feature;
performing channel shuffling on the fifth discrimination feature through a channel shuffling module CSA;
distributing attention to different regions of the fifth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the fifth discrimination feature through the distributed attention to generate a sixth discrimination feature;
distributing attention to each neuron of the sixth discrimination feature through a fourth attention Dropout module ADO, and zeroing the neurons whose attention is smaller than the first preset threshold to generate a seventh discrimination feature;
distributing attention to each channel of the seventh discrimination feature through an attention channel pooling module ACD, and discarding the channels ranked lowest by attention to generate an eighth discrimination feature;
extracting the feature information of the eighth discrimination feature through a feature compression module FS to generate true-false discrimination data;
inputting the true-false discrimination data into the next true-false discrimination module in the true-false discriminator until the SigMoid function module analyzes the true-false discrimination data output by the last true-false discrimination module to obtain a first true-false discrimination value;
and inputting the simulated defect image into the true-false discriminator, and outputting a second true-false discrimination value through the true-false discrimination modules and the SigMoid function module.
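The channel shuffling performed by the module CSA above can be realized with the standard reshape-transpose-reshape trick from ShuffleNet, which interleaves channels across groups so that later layers see mixed channel information; the group count here is an assumption.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int = 4) -> torch.Tensor:
    """Interleave channels: split into groups, transpose, flatten back."""
    b, c, h, w = x.shape
    assert c % groups == 0, "channel count must divide evenly into groups"
    return (x.view(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))
```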
A third generating unit 707 for inputting the real defect image and the simulated defect image into the category discriminator, and generating a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image;
optionally, the category discriminator includes at least one category discriminating module and a softmax function module, where the category discriminating module includes a region pixel attention module RPA, a region channel attention module SKConv, a channel shuffling module CSA, an attention channel pooling module ACD, and a feature compression module FS;
the third generation unit 707 includes:
performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module RPA to generate a ninth discrimination feature;
multiplying the real defect image and the ninth discrimination feature correspondingly by channel through the regional pixel attention module RPA to generate a tenth discrimination feature;
performing channel superposition on the tenth discrimination feature and the first label feature to generate an eleventh discrimination feature;
distributing attention to regions of different sizes of the eleventh discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the eleventh discrimination feature through the distributed attention to generate a twelfth discrimination feature;
performing channel shuffling on the twelfth discrimination feature through a channel shuffling module CSA;
distributing attention to different regions of the twelfth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the twelfth discrimination feature through the distributed attention to generate a thirteenth discrimination feature;
distributing attention to each channel of the thirteenth discrimination feature through an attention channel pooling module ACD, and discarding the channels ranked lowest by attention to generate a fourteenth discrimination feature;
distributing attention to regions of different sizes of the fourteenth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the fourteenth discrimination feature through the distributed attention to generate a fifteenth discrimination feature;
extracting the feature information of the fifteenth discrimination feature through a feature compression module FS to generate category discrimination data;
inputting the category discrimination data into the next category discrimination module in the category discriminator until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain a first category discrimination value;
and inputting the simulated defect image into the category discriminator, and outputting a second category discrimination value through the category discrimination modules and the softmax function module.
A calculating unit 708, configured to calculate a first loss value of the generator, a second loss value of the true-false discriminator, and a third loss value of the category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value, and the second category discrimination value;
a determining unit 709, configured to determine whether the first loss value, the second loss value, and the third loss value meet preset conditions;
the determining unit 710 is configured to determine that the training of the convolutional neural network model is completed when the judging unit determines that the first loss value, the second loss value, and the third loss value satisfy the conditions;
an updating unit 711, configured to, when the judging unit determines that the first loss value, the second loss value, and the third loss value do not satisfy the conditions, update the weight of the generator according to the first loss value, the second loss value, and the third loss value, and update the weight of the true-false discriminator and the weight of the category discriminator according to the second loss value and the third loss value, respectively.
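Pulling the first generating unit's sub-modules (7051 through 7056 above) together, the N-stage chaining of the generator can be sketched as follows. Every constructor argument here is a placeholder for the blocks sketched earlier, and the per-stage call signature is an assumption; the text fixes only that each stage consumes the previous generation parameter together with the shared sampling and label features.

```python
import torch.nn as nn

class StackedGenerator(nn.Module):
    """Hypothetical skeleton: chain N generation modules Generator,
    then restore the final parameters to image space with Conv_out."""
    def __init__(self, n_stages, make_stage, conv_z, conv_label, conv_out):
        super().__init__()
        self.conv_z = conv_z            # first convolution module
        self.conv_label = conv_label    # second convolution module
        self.stages = nn.ModuleList([make_stage() for _ in range(n_stages)])
        self.conv_out = conv_out        # output module Conv_out

    def forward(self, z, label):
        feat_z = self.conv_z(z)              # first sampling feature
        feat_label = self.conv_label(label)  # first label feature
        out = feat_z                         # stage 1 starts from the sampling feature
        for stage in self.stages:            # stage k consumes generation parameter k-1
            out = stage(out, feat_z, feat_label)
        return self.conv_out(out)            # restored simulated defect image
```

Feeding the same sampling and label features into every stage mirrors the repeated channel superposition steps in the description, which keep the conditioning signal available throughout the chain.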
Referring to fig. 8, the present application provides an electronic device, including:
a processor 801, a memory 803, an input-output unit 802, and a bus 804.
The processor 801 is connected to a memory 803, an input-output unit 802, and a bus 804.
The memory 803 holds a program that the processor 801 calls to perform the training method shown in fig. 1 and figs. 5-2 to 5-8.
The present application provides a computer-readable storage medium having a program stored thereon which, when executed on a computer, performs the training method shown in fig. 1 and figs. 5-2 to 5-8.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. A method for training a generative model of a defect data set, comprising:
acquiring a convolutional neural network model and a defect type label, wherein the convolutional neural network model comprises a generator, a true and false discriminator and a category discriminator;
inputting a group of normal distribution sampling data and the defect type label into the generator to generate a simulated defect image;
inputting a real defect image and the simulated defect image into the true and false discriminator to generate a first true and false discrimination value of the real defect image and a second true and false discrimination value of the simulated defect image;
inputting the real defect image and the simulated defect image into the category discriminator to generate a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image;
calculating a first loss value of the generator, a second loss value of the true-false discriminator and a third loss value of the category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value and the second category discrimination value;
judging whether the first loss value, the second loss value and the third loss value meet preset conditions or not;
if the conditions are met, determining that the convolutional neural network model is trained completely;
if the condition is not met, updating the weight of the generator according to the first loss value, the second loss value and the third loss value, and updating the weights of the true and false discriminator and the category discriminator according to the second loss value and the third loss value, respectively.
2. Training method according to claim 1, wherein the generator comprises N generation modules Generator and an output module Conv_out, N being an integer greater than or equal to 2;
inputting a set of normal distribution sampling data and the defect type label into the generator to generate a simulated defect image, including:
performing convolution operation on the normal distribution sampling data through a first convolution module to generate a first sampling feature;
performing convolution operation on the defect type label through a second convolution module to generate a first label feature;
inputting the first sampling feature and the first label feature into a first generation module Generator to generate a first generation parameter;
inputting the first generation parameter, the first sampling feature and the first label feature into a second generation module Generator to generate a second generation parameter;
inputting the (N-1)th generation parameter, the first sampling feature and the first label feature into an Nth generation module Generator to generate an Nth generation parameter;
and restoring and outputting the Nth generation parameter through the output module Conv_out to generate a simulated defect image.
3. Training method according to claim 2, characterized in that the generation module Generator comprises a region pixel Attention module RPA, an Attention Dropout module ADO, at least two channel Attention modules Attention, a deconvolution module and a convolution module;
inputting the first sampling feature and the first label feature into a first generation module Generator to generate a first generation parameter, specifically:
performing regional pixel value weight generation processing on the first sampling feature through a regional pixel attention module (RPA) to generate a first intermediate feature;
multiplying the first sampling characteristic and the first intermediate characteristic correspondingly according to channels through a regional pixel attention module (RPA) to generate a second intermediate characteristic;
performing convolution processing and channel superposition processing on the second intermediate features through a third convolution module;
generating a channel vector for the second intermediate feature by a first channel Attention module Attention;
outputting, through the first channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the second intermediate feature;
correspondingly multiplying the second intermediate features according to the channels by a first channel Attention module Attention according to the normalized one-dimensional vector to generate third intermediate features;
performing convolution processing and channel superposition processing on the third intermediate features through a fourth convolution module;
generating a channel vector for the third intermediate feature by a second channel Attention module Attention;
outputting, through the second channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the third intermediate feature;
correspondingly multiplying the third intermediate features by channels according to the normalized one-dimensional vector through a second channel Attention module Attention to generate fourth intermediate features;
performing channel superposition on the fourth intermediate feature, the first label feature and the first sampling feature to generate a fifth intermediate feature;
performing feature length and width reconstruction on the fifth intermediate feature through a first deconvolution module to generate a sixth intermediate feature;
distributing attention to each neuron corresponding to the sixth intermediate feature through a first attention Dropout module ADO, and setting the neuron with the attention smaller than a first preset threshold to zero to generate a seventh intermediate feature;
and performing feature length and width reconstruction on the seventh intermediate feature through a second deconvolution module to generate a first generation parameter.
4. Training method according to claim 2, wherein said inputting the first generation parameter, the first sampling feature and the first label feature into a second generation module Generator to generate a second generation parameter is specifically:
performing channel superposition on the first generation parameter and the first sampling feature to generate an eighth intermediate parameter;
performing regional pixel value weight generation processing on the eighth intermediate parameter through a regional pixel attention module (RPA) to generate a ninth intermediate feature;
multiplying the eighth intermediate parameter and the ninth intermediate feature by a regional pixel attention module (RPA) according to the channel correspondence to generate a tenth intermediate feature;
performing convolution processing and channel superposition processing on the tenth intermediate feature through a fifth convolution module;
generating a channel vector for the tenth intermediate feature by a third channel Attention module Attention;
outputting, through the third channel Attention module Attention in combination with the channel vector, a normalized one-dimensional vector with the same dimensionality as the channel number of the tenth intermediate feature;
correspondingly multiplying the tenth intermediate feature by the third channel Attention module Attention according to the normalized one-dimensional vector to generate an eleventh intermediate feature;
performing convolution processing and channel superposition processing on the eleventh intermediate feature through a sixth convolution module;
generating a channel vector for the eleventh intermediate feature by a fourth channel Attention module Attention;
outputting a normalized one-dimensional vector with the same dimensionality as the eleventh intermediate characteristic channel number by combining a fourth channel Attention module Attention with a channel vector;
correspondingly multiplying the eleventh intermediate feature by the fourth channel Attention module Attention according to the normalized one-dimensional vector to generate a twelfth intermediate feature;
performing channel superposition on the twelfth intermediate feature, the first label feature and the eighth intermediate parameter to generate a thirteenth intermediate feature;
performing feature length and width reconstruction on the thirteenth intermediate feature through a third deconvolution module to generate a fourteenth intermediate feature;
distributing attention to each neuron corresponding to the fourteenth intermediate feature through a second attention Dropout module ADO, and setting the neuron with the attention smaller than a second preset threshold to zero to generate a fifteenth intermediate feature;
and performing feature length and width reconstruction on the fifteenth intermediate feature through a fourth deconvolution module to generate a second generation parameter.
5. Training method according to claim 2, wherein the true-false discriminator consists of at least one true-false discriminating module and a SigMoid function module, one of the true-false discriminating modules comprising a region pixel attention module RPA, a region channel attention module SKConv, an attention Dropout module ADO, a channel shuffle module CSA, an attention channel pooling module ACD and a feature compression module FS;
the inputting the real defect image and the simulated defect image into the true-false discriminator to generate a first true-false discrimination value of the real defect image and a second true-false discrimination value of the simulated defect image includes:
performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module RPA to generate a first discrimination feature;
multiplying the real defect image and the first discrimination feature correspondingly by channel through the regional pixel attention module RPA to generate a second discrimination feature;
performing channel superposition on the second discrimination feature and the first label feature to generate a third discrimination feature;
distributing attention to different regions of the third discrimination feature through convolution kernels with different-size receptive fields in a region channel attention module SKConv, and screening the feature channels of the third discrimination feature through the distributed attention to generate a fourth discrimination feature;
distributing attention to each neuron of the fourth discrimination feature through a third attention Dropout module ADO, and zeroing the neurons whose attention is smaller than a third preset threshold to generate a fifth discrimination feature;
performing channel shuffling on the fifth discrimination feature through the channel shuffling module CSA;
distributing attention to different regions of the fifth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the fifth discrimination feature through the distributed attention to generate a sixth discrimination feature;
distributing attention to each neuron of the sixth discrimination feature through a fourth attention Dropout module ADO, and zeroing the neurons whose attention is smaller than the first preset threshold to generate a seventh discrimination feature;
distributing attention to each channel of the seventh discrimination feature through an attention channel pooling module ACD, and discarding the channels ranked lowest by attention to generate an eighth discrimination feature;
extracting the feature information of the eighth discrimination feature through a feature compression module FS to generate true-false discrimination data;
inputting the true-false discrimination data into a next true-false discrimination module in the true-false discriminator until a SigMoid function module analyzes the true-false discrimination data output by the last true-false discrimination module to obtain a first true-false discrimination value;
and inputting the simulated defect image into the true-false discriminator, and outputting a second true-false discrimination value through the true-false discrimination modules and the SigMoid function module.
6. Training method according to claim 2, characterized in that the class discriminator comprises at least one class discrimination module and a softmax function module, the class discrimination modules comprising a region pixel attention module RPA, a region channel attention module SKConv, a channel shuffle module CSA, an attention channel pooling module ACD and a feature compression module FS;
the inputting the real defect image and the simulated defect image into the category discriminator to generate a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image includes:
performing regional pixel value weight generation processing on the real defect image through a regional pixel attention module RPA to generate a ninth discrimination feature;
multiplying the real defect image and the ninth discrimination feature correspondingly by channel through the regional pixel attention module RPA to generate a tenth discrimination feature;
performing channel superposition on the tenth discrimination feature and the first label feature to generate an eleventh discrimination feature;
distributing attention to regions of different sizes of the eleventh discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the eleventh discrimination feature through the distributed attention to generate a twelfth discrimination feature;
performing channel shuffling on the twelfth discrimination feature through the channel shuffling module CSA;
distributing attention to regions of different sizes of the twelfth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the twelfth discrimination feature through the distributed attention to generate a thirteenth discrimination feature;
distributing attention to each channel of the thirteenth discrimination feature through an attention channel pooling module ACD, and discarding the channels ranked lowest by attention to generate a fourteenth discrimination feature;
distributing attention to regions of different sizes of the fourteenth discrimination feature through convolution kernels with different-size receptive fields in the region channel attention module SKConv, and screening the feature channels of the fourteenth discrimination feature through the distributed attention to generate a fifteenth discrimination feature;
extracting the feature information of the fifteenth discrimination feature through a feature compression module FS to generate category discrimination data;
inputting the category discrimination data into a next category discrimination module in the category discriminator until the softmax function module analyzes the category discrimination data output by the last category discrimination module to obtain a first category discrimination value;
and inputting the simulated defect image into the category discriminator, and outputting a second category discrimination value through the category discrimination modules and the softmax function module.
7. A training method as claimed in any one of claims 1 to 6, wherein after said obtaining a convolutional neural network model and a defect type label and before said inputting a set of normal distribution sampling data and said defect type label into said generator to generate a simulated defect image, said training method further comprises:
acquiring defect label features of a real defect image through an encoder;
calculating and generating a mean set and a variance set from the defect label features, wherein these hidden space parameters describe the conditional probability distribution of the real defect image;
and sampling the mean set and the variance set by a reparameterization technique to generate normal distribution sampling data, wherein the normal distribution sampling data follows the conditional probability distribution of the real defect image.
8. A training apparatus for generative models of defect data sets, comprising:
the system comprises a first obtaining unit, a second obtaining unit and a defect type label, wherein the first obtaining unit is used for obtaining a convolutional neural network model and a defect type label, and the convolutional neural network model comprises a generator, a true and false discriminator and a category discriminator;
the first generation unit is used for inputting a group of normal distribution sampling data and the defect type label into the generator to generate a simulated defect image;
a second generating unit, configured to input a real defect image and the simulated defect image into the true/false discriminator, and generate a first true/false discrimination value of the real defect image and a second true/false discrimination value of the simulated defect image;
a third generating unit, configured to input the real defect image and the simulated defect image into the category discriminator, and generate a first category discrimination value of the real defect image and a second category discrimination value of the simulated defect image;
a calculating unit, configured to calculate a first loss value of the generator, a second loss value of the true-false discriminator, and a third loss value of the category discriminator according to the first true-false discrimination value, the second true-false discrimination value, the first category discrimination value, and the second category discrimination value;
the judging unit is used for judging whether the first loss value, the second loss value and the third loss value meet preset conditions or not;
the determining unit is used for determining that the convolutional neural network model training is finished when the judging unit determines that the first loss value, the second loss value and the third loss value meet the conditions;
and the updating unit is used for updating the weight of the generator according to the first loss value, the second loss value and the third loss value and respectively updating the true and false discriminator weight and the category discriminator weight according to the second loss value and the third loss value when the judging unit determines that the first loss value, the second loss value and the third loss value do not meet the condition.
9. An electronic device, comprising:
the device comprises a processor, a memory, an input and output unit and a bus;
the processor is connected with the memory, the input and output unit and the bus;
the memory holds a program that the processor calls to perform the training method of any one of claims 1 to 7.
10. A computer-readable storage medium, having a program stored thereon, which when executed on a computer performs the training method of any one of claims 1 to 7.
CN202211498083.2A 2022-11-28 2022-11-28 Training method and related device for defect data set generation model Active CN115526891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211498083.2A CN115526891B (en) 2022-11-28 2022-11-28 Training method and related device for defect data set generation model

Publications (2)

Publication Number Publication Date
CN115526891A true CN115526891A (en) 2022-12-27
CN115526891B CN115526891B (en) 2023-04-07

Family

ID=84705265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211498083.2A Active CN115526891B (en) 2022-11-28 2022-11-28 Training method and related device for defect data set generation model

Country Status (1)

Country Link
CN (1) CN115526891B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115860113A (en) * 2023-03-03 2023-03-28 深圳精智达技术股份有限公司 Training method and related device for self-antagonistic neural network model
CN117894083A (en) * 2024-03-14 2024-04-16 中电科大数据研究院有限公司 Image recognition method and system based on deep learning
CN118332258A (en) * 2024-04-17 2024-07-12 淄博市数字农业农村发展中心 Abnormal data processing method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163267A (en) * 2019-05-09 2019-08-23 厦门美图之家科技有限公司 A kind of method that image generates the training method of model and generates image
CN111126446A (en) * 2019-11-29 2020-05-08 西安工程大学 Method for amplifying defect image data of robot vision industrial product
CN112686894A (en) * 2021-03-10 2021-04-20 武汉大学 FPCB (flexible printed circuit board) defect detection method and device based on generative countermeasure network
CN114818501A (en) * 2022-05-07 2022-07-29 襄阳湖北工业大学产业研究院 Light weight method for solar cell defect detection based on data enhancement
CN115393231A (en) * 2022-11-01 2022-11-25 深圳精智达技术股份有限公司 Defect image generation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN115526891B (en) 2023-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant