CN114462558A

CN114462558A - Data-augmented supervised learning image defect classification method and system

Info

Publication number: CN114462558A
Application number: CN202210382999.5A
Authority: CN
Inventors: 郭波; 张渴望; 张建; 谢云敏
Original assignee: Nanchang Institute of Technology
Current assignee: Nanchang Institute of Technology
Priority date: 2022-04-13
Filing date: 2022-04-13
Publication date: 2022-05-10

Abstract

The invention provides a data-augmented supervised learning image defect classification method and system. The method comprises the following steps: acquiring an image to be trained; inputting the image to be trained into a region-of-interest feature extraction module to obtain an image feature region of interest; constructing a data augmentation model, and performing data augmentation on the image feature region of interest to obtain an augmented data set; constructing a supervised learning neural network model, and training the supervised learning neural network model with the augmented data set; and inputting the image of the region to be predicted into the trained supervised learning neural network model for prediction to obtain an image classification result. The method requires little manual labeling and offers good classification and recognition performance, higher robustness and strong extensibility.

Description

Data-augmented supervised learning image defect classification method and system

Technical Field

The invention relates to the technical field of computer image defect classification, in particular to a method and a system for classifying image defects through data augmentation and supervised learning.

Background

At present, deep learning image classification and detection techniques with good performance often require large-scale manual labeling to meet the training requirements of a high-precision model. However, manually annotating large amounts of data takes a long time and is inefficient, which makes it impractical for certain application scenarios. In the traditional approach of manually inspecting products for defects, workers work for long periods at high labor intensity and are prone to visual fatigue, so the quality of defect detection is inconsistent.

Deep learning based classification of defect images is a common defect detection technique for automatically identifying and detecting image defects. However, the accuracy of defect detection is often poor due to the fact that the defect image data sets required for neural network training are few or it is difficult to obtain sufficient defect data sets.

Disclosure of Invention

In view of the above-mentioned situation, the main objective of the present invention is to provide a method and a system for classifying defects in supervised learning images with augmented data to solve the above-mentioned technical problems.

The embodiment of the invention provides a data-augmented supervised learning image defect classification method, which comprises the following steps:

step one, obtaining an image to be trained;

step two, inputting the image to be trained into a region-of-interest feature extraction module to obtain an image feature region of interest;

step three, constructing a data augmentation model, and performing data augmentation on the image feature region of interest to obtain an augmented data set;

step four, constructing a supervised learning neural network model, and training the supervised learning neural network model with the augmented data set;

and step five, inputting the image of the region to be predicted into the trained supervised learning neural network model for prediction to obtain an image classification result.

The invention provides a data-augmented supervised learning image defect classification method. First, an image to be trained is obtained and input into a region-of-interest feature extraction module to obtain an image feature region of interest. A data augmentation model is then constructed, and data augmentation is performed on the image feature region of interest to obtain an augmented data set. Next, a supervised learning neural network model is constructed and trained with the augmented data set. Finally, the image of the region to be predicted is input into the trained supervised learning neural network model for prediction to obtain an image classification result. After the defect images are extracted by the region-of-interest feature extraction module, using the data augmentation neural network reduces the amount of calculation and addresses the insufficient number of, and imbalance among, different defect images. In addition, the supervised learning neural network model adds negative-valued image feature information and reduces the number of corresponding neural network layers, so that fitting is faster and training time is shorter. The method requires little manual labeling and offers good classification and recognition performance, higher robustness and strong extensibility.

In step two, the method of inputting the image to be trained into the region-of-interest feature extraction module to obtain the image feature region of interest comprises the following steps:

inputting the image to be trained into the region-of-interest feature extraction module for region-of-interest extraction to obtain a defect image;

and normalizing the defect image to obtain the image feature region of interest.

In the third step, the method for constructing the data augmentation model and performing data augmentation on the feature region of interest of the image includes the following steps:

inputting the image interesting feature region into a generator model, and performing up-sampling on the image interesting feature region through a transposed convolution layer in the generator model to obtain an up-sampling defect image, wherein the generator model comprises the transposed convolution layer, a batch normalization layer and an activation function layer which are sequentially connected;

inputting the up-sampling defect image into a batch normalization layer for normalization processing to obtain a normalized up-sampling defect image;

performing function mapping on the normalized up-sampling defect image through the activation function layer to output a generated feature image from the generator model;

inputting the generated characteristic image into a discrimination model, and judging whether the generated characteristic image is consistent with a real defect image;

and if the generated characteristic image is consistent with the real defect image, outputting the generated characteristic image from the discrimination model.

In the step of inputting the up-sampled defect image into the batch normalization layer for normalization, the corresponding normalization formula is:

$$\hat{z}_i^{(l)} = \frac{z_i^{(l)} - \mu^{(l)}}{\sigma^{(l)}}$$

wherein $\mu^{(l)}$ denotes the mean of the values of the neurons input to layer $l$, $z_i^{(l)}$ denotes the value of the $i$-th neuron in layer $l$, $\sigma^{(l)}$ denotes the standard deviation of the neuron activations over a batch of training data, $i$ denotes the index of the neuron, and $\hat{z}_i^{(l)}$ denotes the normalized value of the $i$-th neuron in layer $l$.

In the step of performing function mapping on the normalized up-sampled defect image through the activation function layer so as to output a generated feature image from the generator model, the corresponding activation function is expressed in terms of the following quantities: the first activation function; the value of a neuron node in a specific layer of the neural network model; a normally distributed random number for the input neuron; the variance; and the number of samples;

the discriminant formula corresponding to the discriminant model is expressed as:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

wherein $\min_G$ denotes that the generator model minimizes the discrimination probability of the generated samples, $\max_D$ denotes that the discriminant model maximizes the discrimination probability of the generated samples, $V(D, G)$ denotes the binary cross-entropy loss, $\mathbb{E}_{x \sim p_{data}(x)}$ denotes the expected value over real samples, $\mathbb{E}_{z \sim p_z(z)}$ denotes the expected value over generated samples, $D(x)$ denotes the probability that the real defect image is judged to be real, $D(G(z))$ denotes the probability that the generated feature image output by the generator model is judged to be real, $G(z)$ denotes the generated feature image output by the generator model, and $z$ denotes the noise input to the generator model.

In step four, the method of constructing a supervised learning neural network model and training it with the augmented data set includes the following steps:

inputting the augmented data set into the first layer of the supervised learning neural network model for convolution to obtain second-layer image features and third-layer image features;

fusing the second-layer image features with the third-layer image features to obtain fourth-layer fused image features, and pooling the fourth-layer fused image features to obtain fifth-layer image features;

and performing convolution on the fifth-layer image features and the sixth-layer image features, pooling the seventh-layer image features, and finally completing the training of the supervised learning neural network model through a one-dimensional (flattening) operation.

In the step of inputting the augmented data set into the first layer of the supervised learning neural network model for convolution to obtain the second-layer image features and the third-layer image features, the formula for the convolution operation is expressed in terms of the following quantities: the gray value at a given position of the input function image; the value of the convolution kernel at a given position; the abscissa and the ordinate in the input function image; and the abscissa and the ordinate corresponding to the convolution kernel;

in the step of fusing the second-layer image features with the third-layer image features to obtain the fourth-layer fused image features, the formula for feature fusion is expressed in terms of the following quantities: the fused image features obtained by the cascade (concatenation) function; the vector concatenation operation; the second activation function; the feature information of the convolution image output by the second layer of the supervised learning neural network model; the feature information of the convolution image output by the third layer of the supervised learning neural network model; and the global average pooling operation.

The supervised learning neural network model comprises a plurality of supervised learning neural network layers, and the calculation formula of each neuron in the supervised learning neural network layers is expressed as:

$$a_j^{(l)} = f\left(\sum_{i=1}^{N_{l-1}} w_{ij}^{(l)}\, a_i^{(l-1)}\right)$$

wherein $a_j^{(l)}$ denotes the value of the $j$-th neuron in layer $l$, $a_i^{(l-1)}$ denotes the value of the $i$-th neuron in layer $l-1$, $w_{ij}^{(l)}$ denotes the weight connecting the $i$-th neuron in layer $l-1$ to the neuron in layer $l$, $f$ denotes a non-linear activation function, and $N_{l-1}$ denotes the total number of neurons in layer $l-1$.

The training of the supervised learning neural network model comprises forward propagation and backward propagation. The forward propagation of the supervised learning neural network model is completed through the convolutional layers and pooling layers, the supervised learning neural network model carries out the multi-class cross-entropy function calculation through backward propagation, and the expression of the multi-class cross-entropy function is:

$$L = \frac{1}{N}\sum_{i=1}^{N} L_i = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic}\log(p_{ic})$$

wherein $L$ denotes the multi-class cross-entropy function, $L_i$ denotes the cross-entropy function of sample $i$, $M$ denotes the number of categories, $y_{ic}$ takes the value 0 or 1, being 1 when the true class of sample $i$ is $c$ and 0 otherwise, $p_{ic}$ denotes the value of the prediction function that sample $i$ belongs to category $c$, and $N$ denotes the number of samples;

the weight update formula corresponding to the multi-class cross-entropy function is expressed in terms of the following quantities: the weight update learning rate; the updated weight value; the current weight value; and the prediction function of the sample. The weight update learning rate is itself defined by a separate expression.

The invention also provides a data-augmented supervised learning image defect classification system, wherein the system comprises:

the image acquisition module is used for acquiring an image to be trained;

the interest extraction module is used for inputting the image to be trained into the interest region feature extraction module to obtain an image interest feature region;

the data augmentation module is used for constructing a data augmentation model and augmenting the interested characteristic region of the image to obtain a data set after data augmentation;

the model training module is used for constructing a supervised learning neural network model and training the supervised learning neural network model by using the data set after data augmentation;

and the result output module is used for putting the image of the area to be predicted into the trained supervised learning neural network model for prediction so as to obtain an image classification result.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

FIG. 1 is a flowchart illustrating a method for classifying defects in supervised learning images with augmented data according to a first embodiment of the present invention;

FIG. 2 is a flowchart illustrating a method for classifying defects in supervised learning images with augmented data according to a second embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a supervised learning neural network model according to a second embodiment of the present invention;

FIG. 4 is a diagram illustrating a comparison of the structure of a supervised learning neural network model, AlexNet and VGG16 neural network in an embodiment of the present invention;

FIG. 5 is a graph comparing the accuracy of a training with data augmentation and the accuracy of a test without data augmentation according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a system for classifying defects in supervised learning images with augmented data according to a third embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.

These and other aspects of embodiments of the invention will be apparent with reference to the following description and attached drawings. In the description and drawings, particular embodiments of the invention have been disclosed in detail as being indicative of some of the ways in which the principles of the embodiments of the invention may be practiced, but it is understood that the scope of the embodiments of the invention is not limited correspondingly. On the contrary, the embodiments of the invention include all changes, modifications and equivalents coming within the spirit and terms of the claims appended hereto.

Referring to fig. 1, the present invention provides a method for classifying defects of supervised learning images with data augmentation, wherein the method includes the following steps:

S101, acquiring an image to be trained.

Specifically, the image to be trained generally refers to an image captured by a camera, and the image captured by the camera in daily life is generally a color RGB image. The RGB image is composed of three channels R, G, B (red, green, blue), and each channel is set to 256 values, i.e., 0 to 255, according to the range that can be recognized by human eyes.

S102, inputting the image to be trained into an interested region feature extraction module to obtain an image interested feature region.

Specifically, in machine vision and image processing, the region of interest (ROI) is the region to be processed, delineated in the image being processed by a box, circle, ellipse, irregular polygon or the like; it is also the region where the target object is located. In a specific implementation, the region of interest can be marked manually with a specific identifier, for example by frame selection. Alternatively, the original color image can be scanned and matched against a template image of the target object to find an image region whose similarity to the template image is higher than a threshold; that image region is the region of interest, which is thereby extracted automatically. The region of interest of the original color image can also be identified automatically by a trained neural network model.
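The template-matching alternative described above can be illustrated with a minimal sketch. The text does not specify a similarity measure, so the one used here (one minus the mean absolute gray-level difference divided by 255) is an assumption, as are the function name `find_roi`, the default threshold and the toy grayscale data.

```python
def find_roi(image, template, threshold=0.9):
    """Slide the template over the image and return the best match above threshold.

    Returns (similarity, top, left) for the best-matching window, or None.
    The similarity measure is an assumed stand-in, not taken from the source.
    """
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            # Sum of absolute gray-level differences inside the window.
            diff = sum(
                abs(image[top + r][left + c] - template[r][c])
                for r in range(th)
                for c in range(tw)
            )
            similarity = 1.0 - diff / (255.0 * th * tw)
            if similarity >= threshold and (best is None or similarity > best[0]):
                best = (similarity, top, left)
    return best


# Toy 4 x 4 grayscale image with a bright 2 x 2 patch at rows 1-2, columns 2-3.
image = [
    [0, 0, 0, 0],
    [0, 0, 200, 200],
    [0, 0, 200, 200],
    [0, 0, 0, 0],
]
template = [[200, 200], [200, 200]]
match = find_roi(image, template)
```

In practice a normalized cross-correlation or a similar measure might be preferred; the exhaustive scan here is only meant to make the threshold comparison concrete.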

In this step, the method for inputting the image to be trained to the interested region feature extraction module to obtain the interested feature region of the image includes the following steps:

S1021, inputting the image to be trained into the region-of-interest feature extraction module for region-of-interest extraction to obtain a defect image;

S1022, normalizing the defect image to obtain the image feature region of interest.

S103, constructing a data augmentation model, and performing data augmentation on the image interested characteristic region to obtain a data set after data augmentation.

To achieve recognition with sufficient accuracy, machine learning can improve the generalization performance of a model by collecting more trainable data. A generative adversarial network (GAN) is based on the idea of a two-player zero-sum game: two models, a G (Generator model) network and a D (Discriminator model) network, are trained simultaneously, and after repeated adversarial adjustment the two models reach a Nash equilibrium. After training, the generative adversarial network can generate feature images efficiently and accurately while preserving the features of the defect image as completely as possible. In the field of data augmentation, generative adversarial networks have great advantages.

In the present invention, the method for generating the input image matrix (image feature region of interest) that is input into the data augmentation model includes the following steps:

generating a random noise image whose size equals that of the actual image, namely row × col, where row represents the number of image rows and col represents the number of image columns;

combining the random noise image with the actual image, specifically: reading the pixel gray values at the same position in the random noise image and in the actual image, summing the two values and averaging them, and placing the average into the image matrix at the corresponding position, continuing until image position (row, col) is reached, which finally yields the input image matrix of the data augmentation model.

The image matrix has size row × col. It should be added that this arrangement increases the fitting speed of the data augmentation model and reduces its training time.
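The pixel-wise averaging of a random noise image with the actual image can be sketched as follows. The noise distribution is not stated in the text, so uniform integer noise in 0 to 255 is an assumption here, and `blend_with_noise` is a hypothetical helper name.

```python
import random


def blend_with_noise(image, seed=None):
    """Average an image with a same-sized random noise image, pixel by pixel."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    # Random noise image with the same row x col size as the actual image
    # (uniform gray values in 0..255 are an assumption).
    noise = [[rng.randint(0, 255) for _ in range(cols)] for _ in range(rows)]
    # Sum the two gray values at each position and take their mean.
    blended = [
        [(image[r][c] + noise[r][c]) / 2 for c in range(cols)]
        for r in range(rows)
    ]
    return blended, noise


actual = [[0, 64, 128], [192, 255, 32]]
blended, noise = blend_with_noise(actual, seed=0)
```

The result keeps the row × col shape of the actual image, and every blended value stays within the 0 to 255 gray-level range because it is the mean of two values in that range.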

Further, the input image matrix generated as described above is input to the G (Generator model) network of the GAN. The Generator model is built as a five-layer neural network whose first layer uses a strided transposed convolution (fractionally-strided Conv2D) for up-sampling. To prevent vanishing gradients during convolution and to accelerate model convergence, a batch normalization (Batch Normalization) layer is added. A fully connected (Dense) layer and the first three fractionally-strided Conv2D transposed convolution layers each use the Leaky ReLU activation function for function mapping, after which a feature defect image is output from the Generator model. This image then enters the D (Discriminator model) network together with a real defect image for discrimination. The Discriminator model is built as a three-layer neural network, likewise using Leaky ReLU as the activation function of each layer to improve the generalization capability of the model, and an image is output when the generated feature image is judged to be consistent with the defect image (see FIG. 3).

Specifically, in this step, a data augmentation model is constructed, and a method for augmenting the data of the image interested feature region includes the following steps:

S1031, inputting the image feature region of interest into a generator model, and up-sampling the image feature region of interest through a transposed convolution layer in the generator model to obtain an up-sampled defect image, wherein the generator model comprises the transposed convolution layer, a batch normalization layer and an activation function layer which are sequentially connected.

S1032, inputting the up-sampling defect image into a batch normalization layer for normalization processing to obtain a normalized up-sampling defect image.

In this step, in order to prevent vanishing gradients and accelerate model convergence, a batch normalization (Batch Normalization) layer is added for normalization, and the corresponding normalization formula is:

$$\hat{z}_i^{(l)} = \frac{z_i^{(l)} - \mu^{(l)}}{\sigma^{(l)}}$$

wherein $\mu^{(l)}$ denotes the mean of the values of the neurons input to layer $l$, $z_i^{(l)}$ denotes the value of the $i$-th neuron in layer $l$, $\sigma^{(l)}$ denotes the standard deviation of the neuron activations over a batch of training data, $i$ denotes the index of the neuron, and $\hat{z}_i^{(l)}$ denotes the normalized value of the $i$-th neuron in layer $l$.
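As a minimal sketch of the normalization described in this step, the following subtracts the batch mean from each neuron value and divides by the batch standard deviation. The small constant `eps`, added for numerical stability, and the omission of any learnable scale and shift are assumptions of this sketch; `batch_normalize` is a hypothetical name.

```python
import math


def batch_normalize(values, eps=1e-5):
    """Normalize a batch of neuron values to zero mean and roughly unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    # eps guards against division by zero (an assumption, not from the source).
    std = math.sqrt(variance + eps)
    return [(v - mean) / std for v in values]


normalized = batch_normalize([2.0, 4.0, 6.0, 8.0])
```

After normalization the values are centered on zero with a spread close to one, which is what keeps activations in a range where gradients do not vanish.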

S1033, performing function mapping on the normalized up-sampled defect image through the activation function layer to output a generated feature image from the generator model.

In this step, the corresponding activation function is expressed in terms of the following quantities: the first activation function; the value of a neuron node in a specific layer of the neural network model; a variance term; a normally distributed random number for the input neuron; the variance; and the number of samples.

S1034, inputting the generated characteristic image into a discrimination model, and judging whether the generated characteristic image is consistent with the real defect image.

The discriminant formula corresponding to the discriminant model is expressed as:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

wherein $V(D, G)$ denotes the binary cross-entropy loss, $\mathbb{E}_{x \sim p_{data}(x)}$ denotes the expected value over real samples, $\mathbb{E}_{z \sim p_z(z)}$ denotes the expected value over generated samples, $D(x)$ denotes the probability that the real defect image is judged to be real, $D(G(z))$ denotes the probability that the generated feature image output by the generator model is judged to be real, $G(z)$ denotes the generated feature image output by the generator model, and $z$ denotes the noise input to the generator model.
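To make the roles of the two expectation terms concrete, here is a minimal sketch that evaluates the value function on a small batch, assuming the discriminant formula is the standard GAN minimax objective. The inputs `d_real` and `d_fake` are hypothetical lists of discriminator outputs for real images and generated images; no network is trained here.

```python
import math


def gan_value(d_real, d_fake):
    """Batch estimate of E[log D(x)] + E[log(1 - D(G(z)))]."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
undecided = gan_value([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates the two well scores a higher value.
confident = gan_value([0.9, 0.8], [0.1, 0.2])
```

The discriminator is trained to push this value up, while the generator is trained to push it down by making its outputs harder to tell apart from real defect images.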

S1035, if the generated feature image is consistent with the real defect image, outputting the generated feature image from the discriminant model.

S104, constructing a supervised learning neural network model, and training the supervised learning neural network model with the augmented data set.

In this step, a supervised learning neural network model is constructed, and the method for training the supervised learning neural network model by using the data set after data augmentation comprises the following steps:

S1041, inputting the augmented data set into the first layer of the supervised learning neural network model for convolution to obtain second-layer image features and third-layer image features.

The formula for the convolution operation is expressed in terms of the following quantities: the gray value at a given position of the input function image; the value of the convolution kernel at a given position; the abscissa and the ordinate in the input function image; and the abscissa and the ordinate corresponding to the convolution kernel. Note that, in the first layer, the augmented input data (150 × 150 × 3) is a 3-channel matrix of 150 × 150 pixels.
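A strided 2D convolution over a single-channel image can be sketched as below. This is an illustration under assumptions: it uses no padding, it does not flip the kernel (the cross-correlation form that deep learning frameworks commonly call convolution), and the name `conv2d` and the toy data are hypothetical.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # (i * stride + m, j * stride + n) is the image position covered
            # by kernel position (m, n) for this output element.
            for m in range(kh):
                for n in range(kw):
                    acc += image[i * stride + m][j * stride + n] * kernel[m][n]
            row.append(acc)
        output.append(row)
    return output


# 6 x 6 test image and a 3 x 3 averaging kernel, applied with stride 3.
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
feature_map = conv2d(image, kernel, stride=3)
```

With a stride equal to the kernel size, the windows do not overlap, so each output element summarizes one distinct 3 × 3 block of the input.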

S1042, fusing the second layer image features and the third layer image features to obtain fourth layer fused image features, and pooling the fourth layer fused image features to obtain fifth layer image features.

In the fourth layer of the supervised learning neural network model, the formula for feature fusion is expressed in terms of the following quantities: the vector concatenation operation; the second activation function; the feature information of the convolution image output by the second layer of the supervised learning neural network model; and the global average pooling operation.

It should be noted that the specific expression of the second activation function is defined in terms of the activation value of a neuron node in a particular layer of the neural network model.

In the invention, the supervised learning neural network model first slides a 3 × 3 convolution kernel (filter) over the original information of the input layer of the visual perception area, from left to right and from top to bottom according to the stride, and performs the convolution calculation to obtain a mapped output feature map I*g. The output feature map I*g is then negated to obtain the feature map -I*g, where g is the filter (convolution kernel) and I is the feature map output by the previous layer. In the third layer of the supervised learning neural network model, in this embodiment, the output feature map I*g and the negated feature map -I*g are passed to the next-stage neurons, and a functional relation is applied to each node; that is, the second activation function is used to map the feature information of each node.

S1043, performing convolution on the fifth-layer image features and the sixth-layer image features, pooling the seventh-layer image features, and finally completing the training of the supervised learning neural network model through a one-dimensional (flattening) operation.

In this embodiment, the supervised learning neural network model includes a plurality of supervised learning neural network layers, and the calculation formula of each neuron in the supervised learning neural network layers is expressed as:

$$a_j^{(l)} = f\left(\sum_{i=1}^{N_{l-1}} w_{ij}^{(l)}\, a_i^{(l-1)}\right)$$

wherein $a_j^{(l)}$ denotes the value of the $j$-th neuron in layer $l$, $a_i^{(l-1)}$ denotes the value of the $i$-th neuron in layer $l-1$, $w_{ij}^{(l)}$ denotes the weight connecting the $i$-th neuron in layer $l-1$ to the neuron in layer $l$, $f$ denotes a non-linear activation function, and $N_{l-1}$ denotes the total number of neurons in layer $l-1$.
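The per-neuron calculation described here, a non-linear activation applied to the weighted sum of the previous layer's neuron values, can be sketched as follows. The use of `math.tanh` as the activation and the absence of a bias term are assumptions of this sketch (the text names only a non-linear activation function), and `layer_forward` is a hypothetical name.

```python
import math


def layer_forward(prev_values, weights, activation=math.tanh):
    """Compute one layer: activation of the weighted sum of previous-layer values.

    weights[i][j] connects neuron i of the previous layer to neuron j of this layer.
    """
    n_prev = len(prev_values)
    n_out = len(weights[0])
    return [
        activation(sum(prev_values[i] * weights[i][j] for i in range(n_prev)))
        for j in range(n_out)
    ]


# Two neurons in the previous layer feeding two neurons in the current layer.
outputs = layer_forward([1.0, -1.0], [[0.5, 0.0], [0.5, 1.0]])
```

Here the first output neuron receives 1.0 × 0.5 + (-1.0) × 0.5 = 0 before activation, and the second receives 1.0 × 0.0 + (-1.0) × 1.0 = -1.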

Further, in the present invention, the training of the supervised learning neural network model includes forward propagation and backward propagation. The forward propagation of the supervised learning neural network model is completed through the convolutional layers and pooling layers, the supervised learning neural network model performs the multi-class cross-entropy function calculation through backward propagation, and the expression of the multi-class cross-entropy function is:

$$L = \frac{1}{N}\sum_{i=1}^{N} L_i = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic}\log(p_{ic})$$

wherein $L$ denotes the multi-class cross-entropy function, $L_i$ denotes the cross-entropy function of sample $i$, $M$ denotes the number of categories, $y_{ic}$ takes the value 0 or 1, being 1 when the true class of sample $i$ is $c$ and 0 otherwise, $p_{ic}$ denotes the value of the prediction function that sample $i$ belongs to category $c$, and $N$ denotes the number of samples.
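A minimal sketch of the multi-class cross-entropy described above follows, assuming one-hot labels and predicted class probabilities that sum to one per sample. The function name and the sample data are hypothetical.

```python
import math


def multiclass_cross_entropy(y_true, p_pred):
    """Mean over samples of -sum_c y_ic * log(p_ic) with one-hot labels y."""
    total = 0.0
    for labels, probs in zip(y_true, p_pred):
        # Only the true class (label 1) contributes to the per-sample loss.
        total += -sum(y * math.log(p) for y, p in zip(labels, probs) if y)
    return total / len(y_true)


y_true = [[1, 0, 0], [0, 1, 0]]
p_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
loss = multiclass_cross_entropy(y_true, p_pred)
```

The loss shrinks toward zero as the probability assigned to each sample's true class approaches one, which is the quantity that backward propagation drives down.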

The weight update formula corresponding to the multi-class cross-entropy function is expressed in terms of the following quantities: the weight update learning rate; the updated weight value; the current weight value; and the prediction function of the sample. The weight update learning rate is itself defined by a separate expression.

It should be added that the Adam optimizer is mainly used to update the value of the weight update learning rate. If this parameter value is greater than 1, the model parameter updates explode, which must be avoided during model training. Conversely, too small a parameter value reduces update efficiency, so conditions on its value need to be defined.

In the present invention, the details of the implementation of step S104 are as follows:

The first layer x1 of the supervised learning neural network model acquires the image data, and the second layer x2 and the third layer x3 of the supervised learning neural network model are obtained from it by convolution:

x2: Conv2D(3*3*16)-m_tanh-strides=3;

x3: DeConv2D(3*3*16)-m_tanh-strides=3;

The x2 layer and the x3 layer are fused to obtain x4: Add(x2, x3);

x4 is pooled to obtain x5: MaxPooling2D(3*3);

x5 is convolved to obtain x6: Conv2D(3*3*32)-m_tanh-strides=3;

x6 is convolved to obtain x7: Conv2D(3*3*32)-m_tanh-strides=3;

x7 is pooled to obtain x8: MaxPooling2D(3*3);

x8 is flattened to one dimension to obtain x9: Flatten(x8);

A full connection operation is performed on x9 to obtain x10: Dense(256)-m_tanh;

A full connection operation is performed on x10 to obtain x11: Dense(256)-m_tanh;

A full connection operation is performed on x11 to obtain x12: Dense(64)-m_tanh;

A full connection operation is performed on x12 to obtain x13: Dense(1)-sigmoid;

Here, m_tanh represents the activation function, sigmoid represents the sigmoid function, and strides=3 represents a step size of 3. Conv2D(a × b × c) represents a convolution layer with convolution kernel a × b × c, where a × b is the convolution kernel size and c is the number of convolution kernels. DeConv2D(a × b × c) represents a convolution layer with convolution kernel a × b × c whose output is negated as a whole, i.e., -Conv2D(a × b × c), where a × b and c have the same meaning. MaxPooling2D(3 × 3) represents a max pooling layer, where 3 × 3 is the pooling window. Dense(64) represents a fully connected layer, where 64 is the number of neurons in the layer, and Flatten() represents flattening the input into one dimension.
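To make the layer sequence above easier to follow, the sketch below traces the output shape of each layer for a hypothetical input. Several points are assumptions not stated in the text: the layer names x1 to x13 (following the `Flatten(x8)` naming), "same" padding for the convolutions, a pooling stride equal to the 3 × 3 window, and a 243 × 243 single-channel input chosen so that every stage divides evenly.

```python
import math

def conv_same(size, stride):
    """Output length along one axis for a 'same'-padded strided convolution."""
    return math.ceil(size / stride)

def pool_valid(size, window):
    """Output length along one axis for a 'valid' pool whose stride equals its window."""
    return (size - window) // window + 1

def trace_shapes(height, width, channels=1):
    """Return the assumed output shape of each layer x1..x13."""
    shapes = {"x1": (height, width, channels)}
    h, w = conv_same(height, 3), conv_same(width, 3)
    shapes["x2"] = (h, w, 16)    # Conv2D(3*3*16), strides=3
    shapes["x3"] = (h, w, 16)    # DeConv2D(3*3*16), i.e. the negated convolution
    shapes["x4"] = (h, w, 16)    # Add(x2, x3): element-wise, shape unchanged
    h, w = pool_valid(h, 3), pool_valid(w, 3)
    shapes["x5"] = (h, w, 16)    # MaxPooling2D(3*3)
    h, w = conv_same(h, 3), conv_same(w, 3)
    shapes["x6"] = (h, w, 32)    # Conv2D(3*3*32), strides=3
    h, w = conv_same(h, 3), conv_same(w, 3)
    shapes["x7"] = (h, w, 32)    # Conv2D(3*3*32), strides=3
    h, w = pool_valid(h, 3), pool_valid(w, 3)
    shapes["x8"] = (h, w, 32)    # MaxPooling2D(3*3)
    shapes["x9"] = (h * w * 32,) # Flatten(x8)
    shapes["x10"] = (256,)       # Dense(256)
    shapes["x11"] = (256,)       # Dense(256)
    shapes["x12"] = (64,)        # Dense(64)
    shapes["x13"] = (1,)         # Dense(1) with sigmoid
    return shapes

shapes = trace_shapes(243, 243)
for name, shape in shapes.items():
    print(name, shape)
```

Under these assumptions each strided convolution or pooling stage divides the spatial size by 3, so five such stages reduce the input by a factor of 243 before flattening.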

S105, inputting the image of the region to be predicted into the trained supervised learning neural network model for prediction to obtain an image classification result.

Referring to fig. 2, in a second embodiment of the present invention, the method is described in detail by taking defect classification of the welding image as an example, and specifically includes steps S201 to S204:

S201, acquiring an original welding image.

S202, processing the original welding image through a region-of-interest feature extraction module to obtain a welding defect region.

The welding image acquired by the wide dynamic sensor is preprocessed, and the region of interest (ROI), namely the welding defect region, is extracted.

Specifically, an appropriate threshold is set for the binarization operation, and mean filtering and a morphological opening operation are then applied in sequence as the basis for the correlation operation. The gray-level relation between each pixel and its surrounding pixels is then transformed, and finally the welding defect region is extracted, which reduces both the image processing difficulty and the computation time. The extracted images are then normalized in preparation for input into the supervised learning neural network for training. The normalization operation has already been explained in the first embodiment and is not described again here.
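The preprocessing sequence described above (binarization, mean filtering, opening, then extraction of the defect region) may be sketched on a small grayscale grid as follows. This is an illustrative sketch under stated assumptions, not the module of the invention: the 3 × 3 window, the majority rule used to re-binarize after mean filtering, the bounding-box crop, and all function names are hypothetical, and the gray-level relation step is omitted because the text does not specify it.

```python
def binarize(img, threshold):
    """Map each pixel to 1 (at or above threshold) or 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in img]

def neighborhood(img, r, c):
    """Values in the 3x3 window around (r, c), clipped at the image border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]

def filter3x3(img, reducer):
    """Apply reducer to the 3x3 neighborhood of every pixel."""
    return [[reducer(neighborhood(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]

def mean_filter(mask):
    """Mean filter followed by re-binarization at one half (an assumed rule)."""
    return filter3x3(mask, lambda v: 1 if sum(v) * 2 >= len(v) else 0)

def opening(mask):
    """Morphological opening: erosion (min) followed by dilation (max)."""
    return filter3x3(filter3x3(mask, min), max)

def bounding_box(mask):
    """Smallest box (r0, c0, r1, c1) containing all foreground pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rs = [r for r, _ in coords]
    cs = [c for _, c in coords]
    return min(rs), min(cs), max(rs), max(cs)

def extract_roi(img, threshold):
    """Crop the original image to the region that survives the preprocessing."""
    mask = opening(mean_filter(binarize(img, threshold)))
    box = bounding_box(mask)
    if box is None:
        return []
    r0, c0, r1, c1 = box
    return [row[c0:c1 + 1] for row in img[r0:r1 + 1]]

# Hypothetical 9x9 image: a 5x5 bright region plus one isolated noise pixel.
image = [[0] * 9 for _ in range(9)]
for r in range(2, 7):
    for c in range(2, 7):
        image[r][c] = 200
image[0][8] = 255
roi = extract_roi(image, threshold=128)
print(len(roi), len(roi[0]))
```

In this example the isolated bright pixel is removed by the filtering steps, so the crop covers only the 5 × 5 bright region.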

S203, constructing a data augmentation model, and performing data augmentation on the welding defect region to obtain a data-augmented welding defect data set.

Referring to fig. 3, in the generative adversarial network (GAN) model, the image matrix is input into the G (Generator model) network of the GAN. The Generator model is constructed as a five-layer neural network, and upsampling is performed using the transposed convolution Conv2DTranspose. Meanwhile, in order to prevent vanishing gradients and accelerate model convergence, a batch normalization layer (BatchNormalization) is added. The fully connected layer is a Dense layer, the first three Conv2DTranspose layers are transposed convolutional layers, and each layer uses the Leaky ReLU activation function.

Then, the Generator model outputs an image with welding defect features, and this image is judged in the D (Discriminator model) network. The Discriminator model is constructed as a three-layer neural network and likewise uses the Leaky ReLU activation function in each layer to improve the generalization capability of the model. The generated feature image is output only when it is judged to be consistent with a welding defect image.
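As a simplified illustration of the two operations applied after each upsampling step in the generator, the sketch below normalizes a batch of values and then applies Leaky ReLU. The negative slope `alpha=0.2`, the stabilizing constant `eps`, and the function names are assumptions of this sketch; the learnable scale and shift of a full batch normalization layer are omitted.

```python
import math

def batch_normalize(values, eps=1e-5):
    """Shift and scale a batch of values to roughly zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def leaky_relu(x, alpha=0.2):
    """Pass positive inputs through; scale negative inputs by alpha."""
    return x if x > 0 else alpha * x

def generator_block(values, alpha=0.2):
    """Batch normalization followed by Leaky ReLU, in that order."""
    return [leaky_relu(v, alpha) for v in batch_normalize(values)]

# Hypothetical activations of one neuron over a batch of four samples.
normalized = batch_normalize([1.0, 2.0, 3.0, 4.0])
block_out = generator_block([1.0, 2.0, 3.0, 4.0])
print(normalized)
print(block_out)
```

Because Leaky ReLU keeps a small non-zero slope for negative inputs, the gradient does not vanish for the negative half of the normalized values.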

S204, training the supervised learning neural network model by using the data-augmented welding defect data set so as to realize image defect classification.

In this embodiment, the supervised learning neural network model is far shallower than the two conventional models AlexNet and VGG16. A deep neural network has a large number of layers and can carry a large number of image elements, so it can realize more complex data relation mappings.

At the same time, an excessive number of layers often causes problems such as over-fitting, under-fitting, vanishing gradients, and exploding gradients. The supervised learning neural network model avoids these disadvantages to a certain extent: the number of total parameters and of trainable parameters is greatly reduced, the amount of computation is reduced, and the accuracy of defect classification is maintained (see fig. 4).

In the process of constructing the welding defect detection framework, a neural network that effectively classifies welding defect images is realized. Feature maps obtained by a convolutional neural network usually ignore the learning of negative-valued features. A sigmoid(x) function is used for the feature mapping operation; owing to the properties of this mapping, the derivative required for back propagation can be calculated quickly from the function value, so that back propagation is performed quickly.
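The property relied on above is the standard identity that the derivative of the sigmoid function equals s(1 - s), where s is the sigmoid output already computed in the forward pass. The sketch below checks this numerically; the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Numerically stable sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_grad_from_output(s):
    """Derivative of sigmoid with respect to its input, given the output s."""
    return s * (1.0 - s)

# Compare the closed form against a central finite difference at x = 0.3.
x = 0.3
h = 1e-6
analytic = sigmoid_grad_from_output(sigmoid(x))
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(analytic, numeric)
```

Since the forward output is reused, the backward pass needs only one multiplication and one subtraction per value and no additional exponential.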

Further, the defect classification effect of the supervised learning neural network model is shown in fig. 5. As the number of training epochs increases, the accuracy obtained with data augmentation by the GAN is considerably higher than that obtained without data augmentation, and a stable detection accuracy is reached sooner. At about 40 epochs, the detection accuracy of the supervised learning neural network model has improved greatly and remains relatively stable.

In the invention, weld image defects are classified by the pre-trained supervised learning neural network model, and its accuracy is compared with that of known conventional models such as AlexNet and VGG16. After training is finished, the comparison models and the supervised learning neural network model are fine-tuned using 1% of the test data; the supervised learning neural network model provided by the invention achieves the higher accuracy, reaching 96.15%.

Taking the time required for a single training run of each neural network model as the vertical axis, the time spent on each of 50 training runs by the present model, AlexNet, and VGG16 is plotted as a box plot for comparison. Judging by the length of the boxes, the running time range of the present model is much more stable than that of AlexNet and VGG16. On average, the present model takes the shortest time per training run and has the most stable time distribution. The median line likewise shows that the present model has the most uniform running time distribution.

In order to verify the generalization capability of the model provided by the invention, the MNIST handwritten digit data set is selected for a model generalization test. The data set consists of handwritten digits collected in the United States and contains a training set of 60000 examples and a test set of 10000 examples. The results show that the data set works well with the supervised learning neural network model and is suitable for deep learning neural network classification. When the MNIST data set is classified with the supervised learning neural network model, the accuracy reaches a high level of 98.05%.

In conclusion, after the ROI defect images are extracted, the method reduces the amount of computation and uses the data augmentation neural network to solve the problems of insufficient quantity and imbalance of the different defect images. The supervised learning neural network model adds negative-valued image feature information and reduces the number of neural network layers, which accelerates fitting and reduces training time. The method requires little manual labeling, offers good classification and recognition performance, and has high robustness and strong expandability.

Referring to fig. 6, a third embodiment of the present invention further provides a system for classifying defects in supervised learning images with augmented data, wherein the system includes:

the image acquisition module is used for acquiring an image to be trained;

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent should be subject to the appended claims.

Claims

1. A method for classifying image defects through data augmentation and supervised learning is characterized by comprising the following steps:

step one, obtaining an image to be trained;

2. The method for classifying image defects through data augmentation and supervised learning as recited in claim 1, wherein in the second step, the method for inputting the image to be trained to the region-of-interest feature extraction module to obtain the image feature-of-interest region comprises the following steps:

3. The method for classifying the image defects through data augmentation and supervised learning as recited in claim 1, wherein in the third step, the method for constructing the data augmentation model and performing data augmentation on the image interest feature region comprises the following steps:

4. The method according to claim 3, wherein the step of inputting the upsampled defect image into a batch normalization layer for normalization processing comprises a corresponding normalization formula:

$$\mu^{l} = \frac{1}{m}\sum_{i=1}^{m} x_i^{l},\qquad \sigma^{l} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x_i^{l}-\mu^{l}\right)^{2}},\qquad \hat{x}_i^{l} = \frac{x_i^{l}-\mu^{l}}{\sigma^{l}}$$

wherein $\mu^{l}$ indicates the mean of the neuron values input to the $i$-th neuron of layer $l$, $x_i^{l}$ denotes the value of the $i$-th neuron of layer $l$, $\sigma^{l}$ represents the standard deviation of the activation of neuron $x_i^{l}$ over the batch of training data, $m$ denotes the number of neurons, and $\hat{x}_i^{l}$ denotes the normalized value of the $i$-th neuron of layer $l$.

5. The method according to claim 4, wherein in the step of performing function mapping on the normalized upsampled defect image through the activation function layer to output the generated feature image from the generator model, the corresponding activation function is represented as:

wherein the symbols of the activation function represent, respectively, the first activation function, a normally distributed random number input to the neuron, the variance, and the number of samples;

$$\min_{G}\max_{D} V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z\sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

wherein $\min_{G}\max_{D} V(D,G)$ represents maximizing, with respect to the discriminator model, the probability of correctly discriminating the generated samples, $V(D,G)$ represents the binary cross-entropy loss function, $\mathbb{E}_{x\sim p_{data}(x)}$ represents the expected value over real samples, $\mathbb{E}_{z\sim p_{z}(z)}$ represents the expected value over generated samples, $D(x)$ represents the probability that the real defect image is judged to be real, $G(z)$ represents the generated feature image output by the generator model, and $z$ represents the noise input to the generator model.

6. The method for classifying defects of supervised learning image with augmented data as claimed in claim 5, wherein in the fourth step, a supervised learning neural network model is constructed, and the method for training the supervised learning neural network model by using the data set after augmented data comprises the following steps:

and performing convolution on the fifth-layer image features and the sixth-layer image features, pooling the seventh-layer image features, and finally completing training on the supervised learning neural network model through a one-dimensional operation.

7. The method according to claim 6, wherein in the step of inputting the data-augmented data set into the first layer of the supervised learning neural network model for convolution to obtain the second layer image features and the third layer image features, the formula for performing convolution operation is as follows:

$$g(x,y) = \sum_{s}\sum_{t} w(s,t)\,f(x-s,\,y-t)$$

wherein $g(x,y)$ is the convolution output, $f(x,y)$ represents the gray-level value at position $(x,y)$ in the input image, $w(s,t)$ represents the value of the convolution kernel at position $(s,t)$, $x$ represents the abscissa in the input image, $y$ represents the ordinate in the input image, $s$ represents the abscissa corresponding to the convolution kernel, and $t$ represents the ordinate corresponding to the convolution kernel;

wherein the corresponding symbols represent, respectively, a vector concatenation operation, the second activation function, and a global average pooling operation.

8. The method according to claim 7, wherein the supervised learning neural network model comprises a plurality of supervised learning neural network layers, and the calculation formula of each neuron in the supervised learning neural network layers is represented as:

$$x_j^{l} = f\left(\sum_{i=1}^{n} w_{ij}^{l}\,x_i^{l-1}\right)$$

wherein $x_j^{l}$ denotes the value of the $j$-th neuron in layer $l$, $x_i^{l-1}$ denotes the value of the $i$-th neuron in layer $l-1$, $w_{ij}^{l}$ denotes the weight connecting the $i$-th neuron in layer $l-1$ to the $j$-th neuron in layer $l$, $f(\cdot)$ represents a non-linear activation function, and $n$ denotes the total number of neurons in layer $l-1$.

9. The method of claim 8, wherein the training of the supervised learning neural network model comprises forward propagation and backward propagation, wherein the forward propagation of the supervised learning neural network model is completed by a convolutional layer and a pooling layer, and the supervised learning neural network model performs multi-class cross entropy function calculation by backward propagation, and the expression of the multi-class cross entropy function is as follows: