CN116630301A - Strip steel surface small target defect detection method and system based on super resolution and YOLOv8 - Google Patents
- Publication number
- CN116630301A (application CN202310739468.1A)
- Authority
- CN
- China
- Prior art keywords
- strip steel
- image data
- small target
- defect detection
- yolov8
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
- G06T7/0008—Industrial image inspection checking presence/absence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4076—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution using the original low-resolution images to iteratively correct the high-resolution images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides a method and system for detecting small target defects on the strip steel surface based on super resolution and YOLOv8, wherein the method comprises the following steps: acquiring original training data; preprocessing the original training data based on the SRGAN algorithm; constructing a preprocessed image dataset from the preprocessed image data and performing image data enhancement; constructing a strip steel surface small target defect detection model based on the YOLOv8 algorithm, improving the model with the normalized Wasserstein distance, and training the improved model with the enhanced training dataset; and acquiring test data and inputting it into the improved strip steel surface small target defect detection model to obtain a defect detection result. The invention uses SRGAN to preprocess the original image data, thereby improving the image resolution; a new metric, NWD, is then introduced to optimize the YOLOv8-based strip steel surface small target defect detection model, improving the detection accuracy for small target defects.
Description
Technical Field
The invention relates to the technical field of computer data processing, in particular to a strip steel surface small target defect detection method and system based on super resolution and YOLOv8.
Background
Strip steel is coiled steel with a width of less than 1.2 meters and a thickness of less than 8 millimeters, produced to meet the processing requirements of various metal products, and is widely used in welding, steel-ring manufacturing, spacecraft manufacturing, spring-piece and blade manufacturing, and the like. In high-precision processing, especially spacecraft manufacturing, small target defects on the strip steel surface can cause serious aerospace accidents. Therefore, more accurate detection of small target defects on the strip steel surface is required. However, most current methods rely on manual inspection, whose accuracy and efficiency are limited: small target defects are difficult to spot with the naked eye, and prolonged inspection easily causes eye fatigue.
The prior art CN115953398B proposes a defect recognition method for the strip steel surface, which clusters the gray-level histogram of the gray-scale image with the data points of the corresponding neighborhood histograms and constructs an optimal gray-level run-length matrix; calculates the long-run advantage and short-run advantage from the gray levels of the pixels in the optimal run-length matrix; and judges the defect type of the strip steel from the long-run and short-run advantages. However, CN115953398B determines strip steel defects directly from the gray-scale image, without applying super-resolution processing or image enhancement to improve the recognition model, so its resolution and its recognition capability and accuracy for small strip steel defects are insufficient. Therefore, a high-accuracy method for identifying defects on the strip steel surface is needed.
Disclosure of Invention
The method for detecting the small target defects on the surface of the strip steel based on super resolution and YOLOv8 provided by the embodiment of the invention comprises the following steps:
acquiring original training data;
preprocessing the original training data based on an SRGAN algorithm to obtain preprocessed image data;
constructing a preprocessed image data set based on the preprocessed image data and enhancing the image data to obtain an enhanced training data set;
constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm, improving the strip steel surface small target defect detection model by using a normalized Wasserstein distance, and training the improved strip steel surface small target defect detection model by using an enhanced training data set;
and obtaining test data and inputting the test data into an improved strip steel surface small target defect detection model to detect defects, so as to obtain a defect detection result.
Preferably, preprocessing the original training data based on an SRGAN algorithm to obtain preprocessed image data; comprising the following steps:
constructing a generative adversarial network model based on the SRGAN algorithm;
training the generative adversarial network model with historical image data to obtain a trained generative adversarial network model;
and inputting the original image data into the trained generative adversarial network model to obtain the preprocessed image data.
Preferably, constructing the generative adversarial network model based on the SRGAN algorithm includes:
a generator sub-network and a discriminator sub-network; the generator sub-network comprises a feature extractor, a plurality of residual blocks and an up-sampling layer; the feature extractor is used for extracting features of the low-resolution image, the residual blocks are used for enhancing the feature representation, and the up-sampling layer is used for converting the low-resolution image into a super-resolution image; the discriminator sub-network comprises a convolutional neural network;
preferably, constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm, improving the strip steel surface small target defect detection model by using a normalized Wasserstein distance, and training the improved strip steel surface small target defect detection model by using enhanced training data; comprising the following steps:
constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm;
obtaining a boundary frame of target detection, wherein the boundary frame comprises a prediction frame and a real frame;
based on the prediction frame and the real frame, obtaining a Wasserstein distance;
normalizing the Wasserstein distance to obtain a normalized Wasserstein distance;
and improving the loss function of the strip steel surface small target defect detection model according to the normalized Wasserstein distance.
Preferably, the strip steel surface small target defect detection model comprises:
the device comprises an input layer, a backbone feature extraction sub-network, a neck feature fusion layer and a detection head;
the backbone characteristic extraction sub-network is used for extracting backbone characteristics of the preprocessed image data, and comprises the following steps:
a basic convolution unit, a residual network unit and a spatial pyramid pooling fusion unit,
the basic convolution unit performs the convolution, batch normalization and activation operations; the residual network unit extracts features and connects the extracted features through residual connections; the spatial pyramid pooling fusion unit selects an appropriate size for the output;
the neck feature fusion layer is used for carrying out multi-scale feature fusion on backbone features;
and the detection head is used for carrying out multi-scale target detection on backbone characteristics.
Preferably, the strip steel surface small target defect detection method based on super resolution and YOLOv8 further comprises the following steps:
performing label distribution on the boundary frame based on a target detection label method;
the label distribution of the bounding box based on the target detection label method comprises the following steps:
obtaining a boundary box of target detection;
marking, on the image data in the enhanced training dataset, the region with the same shape, size and center as the target bounding box as a positive label, and the remaining regions as negative labels;
and inputting the label-complete enhanced training data into the strip steel surface small target defect detection model.
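The label-assignment rule above can be sketched as follows. This is an illustrative interpretation, not the patent's implementation: a candidate region is marked positive only when it shares the target bounding box's center and its width and height, and negative otherwise. Boxes are `(cx, cy, w, h)` tuples; all names are hypothetical.

```python
def assign_label(region, target_box, tol=1e-6):
    """Return 'positive' if region matches the target box's center and size."""
    same_center = (abs(region[0] - target_box[0]) < tol and
                   abs(region[1] - target_box[1]) < tol)
    same_size = (abs(region[2] - target_box[2]) < tol and
                 abs(region[3] - target_box[3]) < tol)
    return "positive" if (same_center and same_size) else "negative"
```

For example, `assign_label((10, 10, 4, 4), (10, 10, 4, 4))` yields a positive label, while shifting the center or changing the size yields a negative one.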
Preferably, the strip steel surface small target defect detection method based on super resolution and YOLOv8 further comprises the following steps:
constructing a process improvement scheme model, and inputting a defect detection result into the constructed process improvement scheme model to obtain a process improvement table;
the method comprises the steps of constructing a process improvement scheme model, and inputting a defect detection result into the constructed process improvement scheme model to obtain a process improvement table; comprising the following steps:
constructing a process improvement scheme model;
wherein the process improvement scheme model includes:
a defect judging unit for judging whether a defect exists;
a defect type unit for judging the defect type in case of defect;
the process improvement scheme library is used for storing process improvement schemes and successful cases corresponding to the process improvement schemes;
the process improvement scheme generating unit is used for searching out a process improvement scheme corresponding to the defect type and a success case corresponding to the process improvement scheme from the process improvement scheme library and generating a process improvement table output after summarizing the process improvement scheme and the success case;
inputting the result of defect detection into a process improvement scheme model;
and based on the result of the defect detection, the process improvement scheme model judges whether a defect exists and, if so, judges the defect type and outputs a process improvement table.
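The process improvement scheme model described above can be sketched as a lookup from detected defect type to an improvement scheme plus a stored success case, summarized into a process improvement table. The defect names, schemes and cases below are invented placeholders, not data from the patent:

```python
# Hypothetical process-improvement scheme library: defect type ->
# (improvement scheme, success case). Entries are illustrative only.
SCHEME_LIBRARY = {
    "scratch": ("adjust roller alignment", "case: line 3, 2022"),
    "inclusion": ("tighten raw-material screening", "case: line 1, 2021"),
}

def process_improvement_table(detection_result):
    """detection_result: list of defect-type strings (empty = no defects)."""
    if not detection_result:  # defect judging unit: no defect found
        return []
    table = []
    for defect in detection_result:  # defect type unit + scheme generation unit
        scheme, case = SCHEME_LIBRARY.get(defect, ("manual review", "n/a"))
        table.append({"defect": defect, "scheme": scheme, "success_case": case})
    return table
```

An empty detection result yields an empty table; unknown defect types fall back to a manual-review entry in this sketch.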
Preferably, a preprocessed image dataset is constructed based on the preprocessed image data and image data enhancement is performed to obtain an enhanced training dataset; comprising the following steps:
acquiring preprocessing image data to form a preprocessing image data set;
image data enhancement is carried out on the preprocessed image data set by utilizing a CutMix method, so that a first enhanced image set is obtained;
performing image data enhancement on the preprocessed image dataset using the Mosaic method at a 4:1 ratio to obtain a second enhanced image set;
performing image data enhancement on the preprocessed image dataset using the Mosaic method at a 9:1 ratio to obtain a third enhanced image set;
and combining image data randomly extracted, at a set extraction ratio, from the first enhanced image set, the second enhanced image set and the third enhanced image set into the enhanced training dataset.
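As a minimal sketch of the CutMix step above (on grayscale images stored as 2D lists, for illustration only): a random rectangular patch from one image is pasted into another. Real pipelines operate on tensors and also mix the labels in proportion to the patch area, which is omitted here.

```python
import random

def cutmix(image_a, image_b, patch_h, patch_w, rng=random.Random(0)):
    """Paste a random patch_h x patch_w patch of image_b into image_a."""
    h, w = len(image_a), len(image_a[0])
    top = rng.randrange(0, h - patch_h + 1)
    left = rng.randrange(0, w - patch_w + 1)
    out = [row[:] for row in image_a]  # copy so the inputs stay untouched
    for i in range(patch_h):
        for j in range(patch_w):
            out[top + i][left + j] = image_b[top + i][left + j]
    return out
```

Mixing an all-zeros image with an all-ones image using a 2x2 patch yields an output containing exactly four ones, wherever the patch lands.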
Preferably, raw training data is acquired; comprising the following steps:
an upper surface camera is arranged above the upper surface of the strip steel, and image data of the upper surface of the strip steel are collected;
a lower surface camera is arranged below the lower surface of the strip steel, and image data of the lower surface of the strip steel are collected;
and, for the same region of the strip steel, symmetrically connecting and fusing the image data of the upper surface and the image data of the lower surface to obtain the original training data.
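The acquisition step above can be sketched as follows, under the assumption that "symmetrical connection" means stacking the upper-surface image on the mirrored lower-surface image so the shared strip edge is adjacent (the patent does not specify the exact geometry; images are 2D pixel lists here):

```python
def fuse_surfaces(upper_img, lower_img):
    """Join same-region upper- and lower-surface images into one array."""
    if len(upper_img[0]) != len(lower_img[0]):
        raise ValueError("images must cover the same strip width")
    return upper_img + lower_img[::-1]  # mirror lower image rows, then stack
```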
The invention also provides a strip steel surface small target defect detection system based on super resolution and YOLOv8, which comprises:
the acquisition module is used for acquiring the original image data;
the preprocessing module is used for preprocessing the image data based on the SRGAN algorithm to obtain preprocessed image data;
the model construction module is used for constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm and improving the strip steel surface small target defect detection model by utilizing a normalized Wasserstein distance;
and the defect detection module is used for inputting the preprocessed image data into the improved strip steel surface small target defect detection model to detect defects.
The invention has the beneficial effects that:
the invention uses SRGAN to preprocess the original image data, thereby improving the image resolution; and then, a new measurement standard NWD is introduced to optimize a strip steel surface small target defect detection model constructed based on YOLOv8, so that the detection accuracy of the small target defect is improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of a method for detecting small target defects on the surface of a strip steel based on super resolution and YOLOv8 in an embodiment of the invention;
FIG. 2 is a schematic diagram showing the overall operation of a method for detecting surface defects of a strip steel according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a C3 module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an ELAN module according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a system for detecting small target defects on a strip steel surface based on super resolution and YOLOv8 in an embodiment of the invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
The embodiment of the invention provides a strip steel surface small target defect detection method based on super resolution and YOLOv8, which is shown in fig. 1 and comprises the following steps:
Step 1: acquiring original training data;
step 2: preprocessing the original training data based on an SRGAN algorithm to obtain preprocessed image data;
step 3: constructing a preprocessed image data set based on the preprocessed image data and enhancing the image data to obtain an enhanced training data set;
step 4: constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm, improving the strip steel surface small target defect detection model by using a normalized Wasserstein distance, and training the improved strip steel surface small target defect detection model by using an enhanced training data set;
step 5: and obtaining test data and inputting the test data into an improved strip steel surface small target defect detection model to detect defects, so as to obtain a defect detection result.
The working principle and the beneficial effects of the technical scheme are as follows:
as shown in fig. 2, the original image data is acquired and converted from a low-resolution image to a super-resolution image using the SRGAN algorithm (Super-Resolution Generative Adversarial Network, an image super-resolution algorithm based on a generative adversarial network), and the super-resolution image is used as the preprocessed image data. A strip steel surface small target defect detection model is constructed and improved using the normalized Wasserstein distance. Finally, defect detection is performed on the preprocessed image data using the improved strip steel surface small target defect detection model.
The embodiment of the invention utilizes SRGAN to preprocess the original image data, thereby improving the image resolution; and then, a new measurement standard NWD is introduced to optimize a strip steel surface small target defect detection model constructed based on YOLOv8, so that the detection accuracy of the small target defect is improved.
In one embodiment, step 2 comprises the steps of:
step 2.1: constructing a generative adversarial network model based on the SRGAN algorithm;
step 2.2: training the generative adversarial network model with historical image data to obtain a trained generative adversarial network model;
step 2.3: and inputting the original image data into the trained generative adversarial network model to obtain the preprocessed image data.
The working principle and the beneficial effects of the technical scheme are as follows:
SRGAN (Super-Resolution Generative Adversarial Network) is an image super-resolution algorithm based on a generative adversarial network (GAN). A generative adversarial network model is constructed through the SRGAN algorithm and trained, and the original image data is input into the trained model to obtain the preprocessed image data.
In the data preprocessing stage, the embodiment of the invention uses the SRGAN network to improve the image resolution so as to improve the image quality, increase the pixel number of a small target, reduce the processing noise and be beneficial to improving the definition of the image edge and the detail of the defect lines.
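A toy illustration of why upscaling helps small targets: a 4x nearest-neighbour upsample multiplies the pixel count of a defect region by 16. SRGAN learns a far richer mapping than this; the sketch only demonstrates the resolution/pixel-count effect, not the GAN itself.

```python
def upsample_nn(img, scale=4):
    """Nearest-neighbour upsample of a 2D list of pixels by `scale`."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(scale)]  # repeat columns
        out.extend([wide[:] for _ in range(scale)])    # repeat rows
    return out
```

A single defect pixel becomes a 4x4 block of 16 pixels, giving the detector more signal to work with.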
In one embodiment, the generative adversarial network model includes:
a generator sub-network and a discriminator sub-network;
the generator sub-network comprises a feature extractor, a plurality of residual blocks and an up-sampling layer;
the feature extractor is used for extracting features of the low-resolution image, the residual blocks are used for enhancing the feature representation, and the up-sampling layer is used for converting the low-resolution image into a super-resolution image;
the discriminator sub-network comprises a convolutional neural network.
the working principle and the beneficial effects of the technical scheme are as follows:
the generative adversarial network model constructed by the SRGAN algorithm comprises a generator sub-network and a discriminator sub-network. The generator sub-network adopts a VGG-style deep convolutional network structure and comprises a feature extractor, a plurality of residual blocks and up-sampling layers. The feature extractor extracts features of the original image data, the residual blocks enhance the feature representation, and the up-sampling layers convert the low-resolution original image data into high-resolution preprocessed image data. The discriminator sub-network is also a CNN, whose main objective is to distinguish the generator's HR images from real images. It consists of multiple convolution layers and pooling layers and finally outputs a scalar value indicating whether the input image is real or generated. SRGAN uses multiple loss functions to train the generator network, the most important of which is the adversarial loss function (Adversarial Loss), expressed as:
$$l_{Gen}^{SR}=\sum_{n=1}^{N}-\log D_{\theta_D}\big(G_{\theta_G}(I_n^{LR})\big)$$

where $l_{Gen}^{SR}$ is the adversarial loss function, $G_{\theta_G}(I_n^{LR})$ is the super-resolution image produced by the up-sampling generator, $D_{\theta_D}\big(G_{\theta_G}(I_n^{LR})\big)$ is the probability that the discriminator judges the generated image to be real, $I_n^{LR}$ is the low-resolution image before conversion, $N$ is the number of samples in the training set, and $n$ is the index of the image in the training set.
In addition, SRGAN uses a content loss function (Content Loss), defined as the Euclidean distance between the feature representation of the reconstructed image and that of the corresponding reference image. The specific expression of the content loss function is:

$$l_{VGG/i,j}^{SR}=\frac{1}{W_{i,j}H_{i,j}}\sum_{x=1}^{W_{i,j}}\sum_{y=1}^{H_{i,j}}\Big(\phi_{i,j}\big(I^{HR}\big)_{x,y}-\phi_{i,j}\big(G_{\theta_G}(I^{LR})\big)_{x,y}\Big)^{2}$$

where $l_{VGG/i,j}^{SR}$ is the content loss function, $x$ and $y$ are the coordinates of the pixel points, $\phi_{i,j}(\cdot)_{x,y}$ is the feature map obtained after activation by the $j$-th convolution layer before the $i$-th max-pooling layer in the VGG network, and $W_{i,j}$ and $H_{i,j}$ are the width and height of that feature map.
Finally, a perceptual loss function $l^{SR}$ is defined as a weighted sum of the content loss function and the adversarial loss function; it serves as the total loss function of the generative adversarial network model, with the specific expression:

$$l^{SR}=l_{X}^{SR}+10^{-3}\,l_{Gen}^{SR}$$

where $l^{SR}$ is the total loss function of the generative adversarial network model, $l_{X}^{SR}$ is the content loss function, and $l_{Gen}^{SR}$ is the adversarial loss function.
For better gradient behavior, $-\log D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big)$ is minimized here, which drives the generator sub-network to produce more realistic images, so that the discriminator sub-network cannot distinguish the real images from the generated ones.
The SRGAN training process adopts an alternating scheme: the discriminator sub-network is trained first, then the generator sub-network, and the two alternate. In each training iteration, SRGAN feeds the low-resolution image into the generator sub-network to produce a super-resolution image, then feeds the generated super-resolution image and the real high-resolution image into the discriminator sub-network for judgment. SRGAN then updates the weight parameters of the generator and discriminator sub-networks by minimizing the loss functions. After training is complete, a low-resolution image input into the generator sub-network yields a high-quality super-resolution image.
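The alternating scheme above can be sketched as pure control flow. The update functions are stubs passed in by the caller; a real implementation would compute the losses defined earlier and back-propagate through both sub-networks (e.g. with PyTorch):

```python
def train(num_iters, update_discriminator, update_generator):
    """Alternate D and G updates, as in SRGAN's training loop sketch."""
    log = []
    for it in range(num_iters):
        fake = f"SR_image_{it}"          # generator forward pass (stub)
        update_discriminator(real="HR_image", fake=fake)  # D trained first
        log.append("D")
        update_generator(fake=fake)      # then G, to fool the discriminator
        log.append("G")
    return log
```

Running two iterations produces the update order D, G, D, G, matching the "discriminator first, then generator, alternating" description.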
In one embodiment, step 4 comprises the steps of:
step 4.1: constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm;
step 4.2: obtaining a boundary frame of target detection, wherein the boundary frame comprises a prediction frame and a real frame;
step 4.3: based on the prediction frame and the real frame, obtaining a Wasserstein distance;
step 4.4: normalizing the Wasserstein distance to obtain a normalized Wasserstein distance;
step 4.5: and improving the loss function of the strip steel surface small target defect detection model according to the normalized Wasserstein distance.
The working principle and the beneficial effects of the technical scheme are as follows:
IoU (Intersection over Union) is an index for measuring the overlapping degree of a predicted frame and a real frame in target detection, and is commonly used for measuring the accuracy of a detection model.
In the task of object detection, it is often necessary to detect a plurality of objects in a picture, and each object is usually represented by a bounding box (bounding box). Assuming that the model predicts a box B, the true box is G, and the IoU index is expressed as the intersection area (intersection area) of two boxes divided by their union area (union area), the specific expression is:
IoU=intersection area(B,G)/union area(B,G)
in the formula, intersection area(B, G) is the area of the overlapping portion of the predicted frame B and the real frame G, and union area(B, G) is the area of the union of the predicted frame B and the real frame G.
IoU is a value ranging from 0 to 1, and when IoU is 1, it indicates that the prediction frame completely covers the real frame, and when IoU is 0, it indicates that there is no overlapping portion between the prediction frame and the real frame.
In target detection, a prediction box of IoU greater than a threshold is generally regarded as a correct detection result. For example, when the IoU threshold is 0.5, only when IoU of the predicted frame and the real frame is greater than 0.5, the correct detection result is considered.
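The IoU computation described above can be sketched directly; boxes are assumed here to be axis-aligned and given as (x1, y1, x2, y2) corner coordinates, a representation chosen for illustration.

```python
def iou(box_b, box_g):
    # intersection area of the two boxes divided by their union area
    ix1, iy1 = max(box_b[0], box_g[0]), max(box_b[1], box_g[1])
    ix2, iy2 = min(box_b[2], box_g[2]), min(box_b[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    union = area_b + area_g - inter
    return inter / union if union > 0 else 0.0
```

With a 0.5 threshold, a prediction such as iou((0, 0, 2, 2), (1, 1, 3, 3)) ≈ 0.143 would be rejected as an incorrect detection.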
IoU itself and its extensions (e.g., CIoU, DIoU, EIoU) are very sensitive to positional deviations of small targets, which severely degrades detection performance when they are used in the YOLOv8 detection algorithm. A new metric, the NWD (Normalized Wasserstein Distance), is therefore introduced: a measure based on the Wasserstein distance for comparing the similarity between two probability distributions. Unlike the raw Wasserstein distance, the NWD normalizes the Wasserstein distance with a normalization factor, making it easier to compare and interpret. The advantages of NWD are as follows:
(1) Distribution similarity can be measured whether there is overlap between small objects or not;
(2) NWD is insensitive to targets of different scales and is more suitable for measuring similarity between small targets.
Thus, NWD has better performance than IoU in measuring similarity between small targets. The specific contents are as follows.
Assume that there are two bounding boxes A and B, where A = (cx_a, cy_a, w_a, h_a) is modeled as a two-dimensional Gaussian distribution N_a; cx_a is the coordinate of the center point of bounding box A on the x-axis, cy_a is its coordinate on the y-axis, w_a is the width of bounding box A, and h_a is its height.
Bounding box B = (cx_b, cy_b, w_b, h_b) is modeled as a Gaussian distribution N_b; cx_b is the coordinate of the center point of bounding box B on the x-axis, cy_b is its coordinate on the y-axis, w_b is the width of bounding box B, and h_b is its height.
The second-order Wasserstein distance between the two Gaussians simplifies to:

W_2^2(N_a, N_b) = || [cx_a, cy_a, w_a/2, h_a/2]^T − [cx_b, cy_b, w_b/2, h_b/2]^T ||_2^2

in the formula, W_2^2(N_a, N_b) is the Wasserstein distance of bounding box A and bounding box B, ||…||_2 is the Frobenius (Euclidean) norm, [ ]^T denotes the transpose, N_a is the Gaussian distribution modeled from bounding box A, and N_b is the Gaussian distribution modeled from bounding box B.
However, W_2^2(N_a, N_b) is a distance measure and cannot be directly used as a similarity measure (i.e., a value between 0 and 1, as IoU is). Its exponential form is therefore used for normalization, yielding a new metric called the Normalized Wasserstein Distance (NWD), with the specific expression:

NWD(N_a, N_b) = exp( −sqrt( W_2^2(N_a, N_b) ) / C )

in the formula, NWD(N_a, N_b) is the normalized Wasserstein distance of bounding box A and bounding box B, and C is a constant closely related to the dataset.
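A minimal sketch of the two formulas above for boxes given as (cx, cy, w, h); the value C = 12.8 is only an assumed placeholder here, since the constant is dataset-dependent.

```python
import math

def wasserstein_sq(box_a, box_b):
    # squared 2-Wasserstein distance between the Gaussians modelling two
    # (cx, cy, w, h) boxes: squared L2 distance of their
    # [cx, cy, w/2, h/2] parameter vectors
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    return ((cxa - cxb) ** 2 + (cya - cyb) ** 2
            + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)

def nwd(box_a, box_b, C=12.8):
    # exponential normalization maps the distance into (0, 1]
    return math.exp(-math.sqrt(wasserstein_sq(box_a, box_b)) / C)
```

Identical boxes give NWD = 1, and the value decays smoothly as the boxes move apart, even after they stop overlapping, which is exactly the property IoU lacks for small targets.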
NWD is applied to the non-maximum suppression and to the loss function of YOLOv8. The specific cases are as follows:
NWD-based non-maximum suppression (NMS) is an important part of target detection, used to suppress redundant predicted bounding boxes. In YOLOv8, NMS uses the IoU metric: all predicted boxes are first ranked by score, the highest-scoring box M is selected, and any box that overlaps M significantly (beyond a predefined threshold N_t) is suppressed. However, because IoU is sensitive to positional deviations of small targets, the IoU values of many predicted boxes fall below N_t, producing false-positive predictions. To address this problem, NWD is used as the new criterion for NMS, effectively overcoming the scale-sensitivity problem. For the regression loss, IoU-Loss was introduced in the YOLO algorithm to eliminate the performance gap between training and testing; however, IoU-Loss cannot provide a gradient for the network in the following two cases:
(1) There is no overlap between the predicted bounding box P and the real bounding box G (i.e., |P ∩ G| = 0);
(2) The predicted bounding box P entirely contains the real bounding box G, or vice versa (i.e., |P ∩ G| = P or G).
Both of these cases are very common in small target detection, so IoU-Loss is not suitable for it. To address this problem, the NWD metric is turned into a loss function:

L_NWD = 1 − NWD(N_P, N_G)

wherein, L_NWD is the normalized Wasserstein loss, N_P is the Gaussian distribution of the predicted bounding box P, N_G is the Gaussian distribution of the real bounding box G, and NWD(N_P, N_G) is the normalized Wasserstein distance of the predicted bounding box P and the real bounding box G.
Compared to IoU, NWD has the following advantages in detecting small targets:
(1) Scale invariance;
(2) Smoothness with respect to position deviation;
(3) Similarity between non-overlapping or mutually inclusive bounding boxes is measured.
The steps of the new loss algorithm are as follows:
Step one: given a predicted bounding box and a target bounding box, the algorithm computes the Wasserstein distance from the squared Euclidean distance between the two center points and the squared differences of the two widths and heights. A small number eps is introduced to avoid division by zero. The Wasserstein distance reflects both the positional and the scale difference between the predicted and target bounding boxes.
Step two: the algorithm forms a weighted average of the IoU loss and the Wasserstein loss according to their respective ratios.
Step three: the algorithm adds the weighted IoU loss to the weighted Wasserstein loss to obtain the final loss value.
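The three steps can be sketched as one function; boxes are (cx, cy, w, h), and the mixing weight `iou_ratio` and the constant `C` are assumed hyper-parameters, since the patent does not give their values.

```python
import math

def nwd_iou_loss(pred, target, iou_ratio=0.5, C=12.8, eps=1e-7):
    cxp, cyp, wp, hp = pred
    cxt, cyt, wt, ht = target
    # step one: Wasserstein distance from center-point and width/height
    # differences (eps avoids zero values in the square root and the union)
    wd = math.sqrt((cxp - cxt) ** 2 + (cyp - cyt) ** 2
                   + ((wp - wt) / 2) ** 2 + ((hp - ht) / 2) ** 2 + eps)
    l_nwd = 1.0 - math.exp(-wd / C)
    # IoU of the two boxes, corner form derived from centers and sizes
    ix = max(0.0, min(cxp + wp / 2, cxt + wt / 2) - max(cxp - wp / 2, cxt - wt / 2))
    iy = max(0.0, min(cyp + hp / 2, cyt + ht / 2) - max(cyp - hp / 2, cyt - ht / 2))
    inter = ix * iy
    union = wp * hp + wt * ht - inter + eps
    l_iou = 1.0 - inter / union
    # steps two and three: weighted average of the two losses
    return iou_ratio * l_iou + (1.0 - iou_ratio) * l_nwd
```

Unlike a pure IoU loss, the NWD term still distinguishes a far prediction from a near one when neither overlaps the target, so the loss keeps providing a useful signal in the two gradient-free cases listed above.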
The embodiment of the invention adopts the normalized Wasserstein distance to improve the loss function of the strip steel surface small target defect detection model, which has the following advantages: 1. distribution similarity can be measured whether or not the small objects overlap; 2. NWD is insensitive to targets of different scales and is better suited to measuring similarity between small targets.
In one embodiment, the strip surface small target defect detection model comprises:
The device comprises an input layer, a backbone feature extraction sub-network, a neck feature fusion layer and a detection head;
the backbone characteristic extraction sub-network is used for extracting backbone characteristics of the preprocessed image data, and comprises the following steps:
a basic convolution unit, a residual network unit and a spatial pyramid pooling fusion unit,
the basic convolution unit is used for completing convolution operation, batch normalization and activation operation; the residual network unit is used for extracting the characteristics and connecting the extracted characteristics by using the residual; the space pyramid pooling fusion unit is used for selecting a proper size for output;
the neck feature fusion layer is used for carrying out multi-scale feature fusion on backbone features;
and the detection head is used for carrying out multi-scale target detection on backbone characteristics.
The working principle and the beneficial effects of the technical scheme are as follows:
the strip steel surface small target defect detection model is constructed based on the YOLOv8 algorithm and comprises an input layer, a backbone feature extraction sub-network, a neck feature fusion layer and a detection head.
The backbone feature extraction network comprises a basic convolution unit (CBS), a residual network unit (C2 f) and a spatial pyramid pooling fusion unit (SPPF). The basic convolution unit is used for completing convolution operation, batch normalization and activation operation.
The residual network unit is used for feature extraction; its key characteristic is the use of residual connections, which allow the network to learn complex features by combining information from different layers. The residual network unit (C2f) draws on the design of the C3 module of YOLOv5 and the ELAN idea of YOLOv7, ensuring the light weight of YOLOv8 while obtaining rich gradient-flow information. As shown in fig. 3 and fig. 4, the C3 module uses 3 convolution modules (Conv+BN+SiLU) and n BottleNeck blocks, while the ELAN module uses group convolution to increase the cardinality of the newly added features and combines the different groups of features by shuffling and merging. This mode of operation strengthens the features learned from different feature maps and improves the utilization of parameters and computation.

The SPPF module outputs larger and more accurate feature representations, which helps capture fine-grained and coarse-grained features from the input feature map and improves the accuracy of the target detection task. Its structure is three MaxPool layers connected in series, each with a 5×5 kernel.

The neck feature fusion network (Neck) adopts an FPN+PAN structure. The FPN conveys strong semantic information from top to bottom, improving detection performance; the PAN supplements the FPN with a bottom-up structure that passes strong low-level positioning features upward, improving the localization signal. Together, the FPN+PAN structure gives the model rich, multi-scale feature information.
The detection head (Head) is a key component of the YOLOv8 algorithm; it performs multi-scale target detection on the feature maps extracted by the backbone to obtain the detection result. The detection head adopts a decoupled structure.
The loss functions of YOLOv8 comprise a classification loss function and a regression loss function. The classification loss function is the Varifocal Loss (VFL), and the regression loss function combines CIoU with the Distribution Focal Loss (DFL). In a single-stage object detection algorithm, the detector usually generates candidate boxes on the order of 10k, of which only a few are positive samples; this leads to an imbalance between the numbers of positive and negative samples and between easy and hard samples, and the Focal Loss (FL) was proposed to solve this problem.
The focal loss function FL serves the classification branch of single-stage detectors and supports discrete class labels of 0 or 1.
The purpose is to solve the situation of unbalanced sample number:
(1) Positive sample loss increases and negative sample loss decreases;
(2) The loss of difficult samples increases and the loss of simple samples decreases.
Classification generally uses a cross-entropy loss function, which can balance the numbers of positive and negative samples, but in practice a large number of candidate targets in target detection are easily-separable samples. The loss of each such sample is low, yet because their number is so extremely large, the easily-separable samples end up dominating the total loss.
Thus, the influence of any single easily-separable sample (i.e., a high-confidence sample) on the model is very small, and the model should focus on the samples that are difficult to separate. A simple idea is to reduce the loss of high-confidence samples somewhat, so the specific expression of the focal loss function is:

FL(p, y) = −(1 − p)^γ · log(p), if y = 1;  FL(p, y) = −p^γ · log(1 − p), if y = 0

where p represents the prediction score, y represents the class label, γ is the modulation factor, and FL is the focal loss function.
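A scalar sketch of the focal loss for a single prediction, in the binary-label form above (α-balancing omitted for brevity):

```python
import math

def focal_loss(p, y, gamma=2.0):
    # (1 - p)^gamma (for y = 1) or p^gamma (for y = 0) down-weights
    # easy, high-confidence samples; gamma = 0 recovers cross entropy
    if y == 1:
        return -((1 - p) ** gamma) * math.log(p)
    return -(p ** gamma) * math.log(1 - p)
```

For a confident positive (p = 0.9, γ = 2) the loss is cut by a factor of 100 relative to plain cross entropy, while a hard positive (p = 0.5) keeps a quarter of it, which is exactly the re-weighting the text describes.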
The focal loss function treats positive and negative samples symmetrically (i.e., weights them with the same parameters), whereas the Varifocal Loss (VFL) uses asymmetric parameters to weight positive and negative samples. The varifocal loss handles the unequal foreground and background by attenuating only the negative samples; the specific expression is:

VFL(p, q) = −q( q·log(p) + (1 − q)·log(1 − p) ), if q > 0;  VFL(p, q) = −α·p^γ·log(1 − p), if q = 0

where VFL( ) is the varifocal loss function, p is the predicted score, q is the target score (q > 0 for positive samples and q = 0 for negative samples), γ is the modulation factor, and α is the weighting factor for negative samples.
If a positive sample has a high IoU, its contribution to the loss is larger, letting the network focus on high-quality samples; that is, training on high-quality positive examples raises AP more than training on low-quality ones. For negative samples, the factor p^γ reduces their weight and thus their contribution to the loss function: since the prediction p of a negative sample is certainly small, raising it to the power γ makes it smaller still.
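A scalar sketch of the varifocal loss above; q is the target score of a positive sample (0 for negatives), and the α, γ defaults are values commonly used with VFL, assumed here rather than given in the patent.

```python
import math

def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    if q > 0:
        # positive sample: cross entropy weighted by the target score q,
        # so high-quality (high-q) positives contribute more
        return -q * (q * math.log(p) + (1 - q) * math.log(1 - p))
    # negative sample: only this branch is attenuated, by p**gamma
    return -alpha * (p ** gamma) * math.log(1 - p)
```

The asymmetry is visible directly: raising q raises a positive sample's loss, while a negative sample's loss stays far below plain cross entropy.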
The Distribution Focal Loss (DFL) of the regression loss function optimizes, in cross-entropy form, the probabilities of the two discrete positions nearest to the label y (one on each side), so that the network focuses more quickly on the distribution in the neighborhood of the target position. It is defined as:

DFL(S_i, S_i+1) = −( (y_i+1 − y)·log(S_i) + (y − y_i)·log(S_i+1) )

where DFL( ) is the distribution focal loss function, y is the regression label, y_i and y_i+1 are the nearest discrete positions to the left and right of y, S_i is the predicted probability at position y_i, and S_i+1 is the predicted probability at position y_i+1.
DFL allows the network to focus more quickly on values near the target y, increasing their probability.
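A scalar sketch of the DFL definition above; `S_i` and `S_ip1` are the predicted probabilities at the two discrete positions bracketing the label y.

```python
import math

def dfl(y, y_i, S_i, S_ip1):
    # cross entropy against the two nearest discrete positions y_i and
    # y_i + 1, weighted by how close y lies to each of them
    y_ip1 = y_i + 1
    return -((y_ip1 - y) * math.log(S_i) + (y - y_i) * math.log(S_ip1))
```

The loss is smallest when the predicted probabilities match the label's fractional position, e.g. S_i = 0.7 and S_ip1 = 0.3 for y = 4.3, which is how DFL pulls probability mass toward the target's neighborhood.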
According to the embodiment of the invention, a strip steel surface small target defect detection model is constructed in which the residual network unit draws on the C3 module of YOLOv5 and the ELAN idea of YOLOv7, ensuring the light weight of YOLOv8 while obtaining rich gradient-flow information; VFL is used as the classification loss function and DFL as the regression loss function, so that the network classifies and focuses quickly.
In one embodiment, the method for detecting the small target defect on the surface of the strip steel based on super resolution and YOLOv8 further comprises the following step 6: performing label distribution on the target boundary frame based on a target detection label method;
wherein, step 6 comprises the following steps:
step 6.1: acquiring a target boundary box in a target detection label method;
step 6.2: marking the areas with the same shape and size and the same center of the target boundary box as positive labels on the preprocessed image data, and marking the rest areas as negative labels;
step 6.3: and inputting the preprocessed image data completed by the label into a strip steel surface small target defect detection model.
The working principle and the beneficial effects of the technical scheme are as follows:
When the target bounding box is marked with feature labels, label noise is easily generated during training, because a marked feature may actually lie on the background or on an occluding object. A new labeling strategy is therefore introduced to suppress the label noise of YOLOv8.
YOLOv8 limits the prediction of the target bounding box by assigning it to the appropriate FPN level according to its scale or target regression distance. Thus, two different regions are constructed for each target bounding box, a "positive region" is defined as a region concentric with and shaped the same as the target bounding box, and the size of the "positive region" is experimentally set. All features that spatially fall within the "positive region" of the target bounding box are then identified as positive features, with the remainder being negative features. Each forward feature is assigned to the target bounding box that contains it.
The embodiments of the present invention assign each positive feature to the target instance whose box contains it; feature assignment in the intersection region of different real bounding boxes (GT boxes) then needs to be handled. In this case, embodiments of the present invention assign such features to the GT box whose center is nearest to them. Similar to other anchor-free methods, embodiments of the present invention train each foreground feature assigned to a target object to predict the coordinates of its target's GT box.
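The assignment rule can be sketched as follows, assuming for illustration that the "positive region" is the full GT box (the patent sets its size experimentally) and that GT boxes are given as (cx, cy, w, h).

```python
def assign_labels(features, gt_boxes):
    # features: (x, y) locations; returns one GT index per feature,
    # or -1 for a negative label; a feature inside several positive
    # regions goes to the GT box whose center is closest
    labels = []
    for fx, fy in features:
        hits = []
        for k, (cx, cy, w, h) in enumerate(gt_boxes):
            if abs(fx - cx) <= w / 2 and abs(fy - cy) <= h / 2:
                dist_sq = (fx - cx) ** 2 + (fy - cy) ** 2
                hits.append((dist_sq, k))
        labels.append(min(hits)[1] if hits else -1)
    return labels
```

In the overlapping region of two GT boxes, the center-distance tie-break decides ownership, matching the rule described above.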
In one embodiment, the method for detecting the small target defect on the surface of the strip steel based on super resolution and YOLOv8 further comprises the following step 7: constructing a process improvement scheme model, inputting a defect detection result into the constructed process improvement scheme model to obtain a process improvement table, and improving the strip steel processing process according to the process improvement table;
wherein, step 7 includes the following steps:
step 7.1: constructing a process improvement scheme model;
wherein, the technology improvement scheme model includes:
a defect judging unit for judging whether a defect exists;
a defect type unit for judging the defect type in case of defect;
the process improvement scheme library is used for storing process improvement schemes and successful cases corresponding to the process improvement schemes;
The process improvement scheme generating unit is used for searching out a process improvement scheme corresponding to the defect type and a success case corresponding to the process improvement scheme from the process improvement scheme library and generating a process improvement table output after summarizing the process improvement scheme and the success case;
step 7.2: inputting the result of defect detection into a process improvement scheme model;
step 7.3: based on the result of the defect detection, the process improvement scheme model judges whether a defect exists, if so, judges the defect type and outputs a process improvement table.
The working principle and the beneficial effects of the technical scheme are as follows:
A process improvement scheme model is constructed and used to judge whether a defect exists and, if so, to judge the defect type. According to the defect types present on the strip steel surface, the corresponding process improvement scheme is retrieved by defect type, and the strip steel processing process is adjusted through the generated process improvement table. For example, if the defect type is "impurity-rich", the scheme "clear impurities in pretreatment of strip steel" and its success cases are retrieved and summarized into a process improvement table. Engineers then use the process improvement table to improve the strip steel processing process.
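A minimal sketch of the scheme-library lookup; the defect types, schemes and cases below are illustrative placeholders, not the patent's actual library.

```python
SCHEME_LIBRARY = {
    "impurity-rich": ("clear impurities in pretreatment of strip steel",
                      ["case: impurity defects eliminated after added cleaning"]),
    "scratch": ("adjust roller pressure and alignment",
                ["case: scratch rate dropped after roller servicing"]),
}

def process_improvement_table(detected_defects):
    # no defects -> no table; otherwise one row per distinct defect type,
    # summarizing the matching scheme and its success cases
    if not detected_defects:
        return None
    return [{"defect": d,
             "scheme": SCHEME_LIBRARY.get(d, ("no scheme on file", []))[0],
             "cases": SCHEME_LIBRARY.get(d, ("no scheme on file", []))[1]}
            for d in sorted(set(detected_defects))]
```

Duplicate detections of the same defect type collapse into a single table row, so the table summarizes defect types rather than individual detections.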
The embodiment of the invention constructs a process improvement scheme model and generates a process improvement table based on the detection result of the small target defect on the surface of the strip steel, which is beneficial to analyzing and changing the detection result of the defect.
In one embodiment, step 3 comprises:
step 3.1: acquiring preprocessing image data to form a preprocessing image data set;
step 3.2: performing image data enhancement on the preprocessed image dataset by using the CutMix method to obtain a first enhanced image set;
step 3.3: performing image data enhancement on the preprocessed image dataset by using the Mosaic method in a 4:1 ratio to obtain a second enhanced image set;
step 3.4: performing image data enhancement on the preprocessed image dataset by using the Mosaic method in a 9:1 ratio to obtain a third enhanced image set;
step 3.5: randomly extracting image data from the first enhanced image set, the second enhanced image set and the third enhanced image set according to a set extraction ratio, so as to form an enhanced training dataset.
The working principle and the beneficial effects of the technical scheme are as follows:
Image data enhancement is performed on the preprocessed image dataset by using the CutMix method to obtain a first enhanced image set; by using the Mosaic method in a 4:1 ratio to obtain a second enhanced image set; and by using the Mosaic method in a 9:1 ratio to obtain a third enhanced image set.
Dataset images are then randomly drawn for data enhancement according to the set extraction ratio of first enhanced image set : second enhanced image set : third enhanced image set = 2:4:4.
According to the embodiment of the invention, the training samples are processed by combining CutMix and Mosaic with traditional global data enhancement, so that the number of small target samples is enriched.
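The 2:4:4 mixing described above can be sketched deterministically; the draw counts follow the ratio exactly, and the set contents below are illustrative.

```python
import random

def build_enhanced_dataset(first, second, third, n=100, ratio=(2, 4, 4), seed=0):
    # draw n images from the three enhanced sets in the given extraction
    # ratio, then shuffle the combined sample into one training dataset
    rng = random.Random(seed)
    total = sum(ratio)
    counts = [n * r // total for r in ratio]
    sample = [rng.choice(pool)
              for pool, k in zip((first, second, third), counts)
              for _ in range(k)]
    rng.shuffle(sample)
    return sample
```

With n = 100 and ratio 2:4:4, the combined set contains exactly 20, 40 and 40 images from the first, second and third enhanced sets respectively.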
In one embodiment, step 1 comprises:
step 1.1: an upper surface camera is arranged above the upper surface of the strip steel, and image data of the upper surface of the strip steel are collected;
step 1.2: a lower surface camera is arranged below the lower surface of the strip steel, and image data of the lower surface of the strip steel are collected;
step 1.3: and (3) based on the same area of the strip steel, carrying out symmetrical connection and fusion on the image data of the upper surface of the strip steel and the image data of the lower surface of the strip steel to obtain the original training data.
The working principle and the beneficial effects of the technical scheme are as follows:
A certain square area of the strip steel is selected, its four vertices being A′, B′, C′ and D′. Image data of the upper and lower surfaces of this area are collected respectively, and the two images are symmetrically joined and fused along one edge of the area (for example along the A′B′ edge) and used as the original training data.
According to the embodiment of the invention, the two cameras are respectively arranged on the upper surface and the lower surface of the strip steel. The image data acquired by the two cameras are symmetrically connected and fused and used as original training data, and the upper surface and the lower surface of a certain area of the strip steel can be detected at the same time.
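A minimal sketch of the symmetric join, representing each surface image as a list of pixel rows; mirroring the lower-surface image vertically places the shared A′B′ edge of the two images next to each other (an assumed interpretation of the "symmetrical connection" described above).

```python
def fuse_surfaces(top_rows, bottom_rows):
    # stack the upper-surface image on top of the vertically mirrored
    # lower-surface image so the shared edge is adjacent in the result
    return top_rows + bottom_rows[::-1]
```

For two 2-row images, the fused image has 4 rows, with the lower-surface rows reversed so both surfaces meet at the shared edge.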
The invention also provides a strip steel surface small target defect detection system based on super resolution and YOLOv8, as shown in fig. 5, comprising:
the acquisition module 1 is used for acquiring original training data;
the preprocessing module 2 is used for preprocessing the original training data based on an SRGAN algorithm to obtain preprocessed image data;
an image enhancement module 3, configured to construct a preprocessed image dataset based on the preprocessed image data and enhance the image data, to obtain an enhanced training dataset;
the model construction module 4 is used for constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm, improving the strip steel surface small target defect detection model by using a normalized Wasserstein distance, and training the improved strip steel surface small target defect detection model by using an enhanced training data set;
and the defect testing module 5 is used for acquiring testing data and inputting the testing data into the improved strip steel surface small target defect detection model to detect defects so as to obtain a testing result.
The working principle and the beneficial effects of the technical scheme are as follows:
The preprocessing module 2 converts the original image data from low-resolution images to super-resolution images by using the SRGAN algorithm (Super-Resolution Generative Adversarial Network, an image super-resolution algorithm based on a generative adversarial network); the image enhancement module 3 performs image data enhancement on the preprocessed image dataset formed from the preprocessed image data to obtain an enhanced training dataset; and the model construction module 4 constructs the strip steel surface small target defect detection model and improves it using the normalized Wasserstein distance. The defect testing module 5 detects defects in the test data by using the improved strip steel surface small target defect detection model.
According to the embodiment of the invention, the original image data is converted into the super-resolution image through the SRGAN algorithm, and the normalized Wasserstein distance is utilized to improve the small target defect detection model on the surface of the strip steel, so that the detection accuracy of the small target defect is improved.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (10)
1. The strip steel surface small target defect detection method based on super resolution and YOLOv8 is characterized by comprising the following steps of:
acquiring original training data;
preprocessing the original training data based on an SRGAN algorithm to obtain preprocessed image data;
constructing a preprocessed image data set based on the preprocessed image data and enhancing the image data to obtain an enhanced training data set;
constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm, improving the strip steel surface small target defect detection model by using a normalized Wasserstein distance, and training the improved strip steel surface small target defect detection model by using an enhanced training data set;
and obtaining test data and inputting the test data into an improved strip steel surface small target defect detection model to detect defects, so as to obtain a defect detection result.
2. The method for detecting the small target defects on the surface of the strip steel based on super resolution and YOLOv8, according to claim 1, wherein the original training data is preprocessed based on an SRGAN algorithm to obtain preprocessed image data; comprising the following steps:
constructing a generative adversarial network model based on an SRGAN algorithm;
training the generative adversarial network model by utilizing historical image data to obtain a trained generative adversarial network model;
and inputting the original image data into the trained generative adversarial network model to obtain the preprocessed image data.
3. The super-resolution and YOLOv8 based strip steel surface small target defect detection method of claim 2, wherein constructing the generative adversarial network model based on the SRGAN algorithm comprises:
a generator sub-network and a discriminator sub-network; the generator sub-network comprises a feature extractor, a plurality of residual blocks and an up-sampling layer; the feature extractor is used for extracting features of the low-resolution image, the residual blocks are used for enhancing the feature expression, and the up-sampling layer is used for converting the low-resolution image into a super-resolution image; the discriminator sub-network comprises a convolutional neural network.
4. The method for detecting the small target defects on the surface of the strip steel based on super resolution and YOLOv8, as claimed in claim 3, wherein a small target defect detection model on the surface of the strip steel is constructed based on a YOLOv8 algorithm, the small target defect detection model on the surface of the strip steel is improved by using a normalized Wasserstein distance, and the improved small target defect detection model on the surface of the strip steel is trained by using enhanced training data; comprising the following steps:
constructing a strip steel surface small target defect detection model based on a YOLOv8 algorithm;
Obtaining a boundary frame of target detection, wherein the boundary frame comprises a prediction frame and a real frame;
based on the prediction frame and the real frame, obtaining a Wasserstein distance;
normalizing the Wasserstein distance to obtain a normalized Wasserstein distance;
and improving the loss function of the strip steel surface small target defect detection model according to the normalized Wasserstein distance.
5. The method for detecting small target defects on the surface of a strip steel based on super resolution and YOLOv8 as claimed in claim 4, wherein the model for detecting small target defects on the surface of the strip steel comprises:
the device comprises an input layer, a backbone feature extraction sub-network, a neck feature fusion layer and a detection head;
the backbone characteristic extraction sub-network is used for extracting backbone characteristics of the preprocessed image data, and comprises the following steps:
a basic convolution unit, a residual network unit and a spatial pyramid pooling fusion unit,
the basic convolution unit is used for completing convolution operation, batch normalization and activation operation; the residual network unit is used for extracting the characteristics and connecting the extracted characteristics by using the residual; the space pyramid pooling fusion unit is used for selecting a proper size for output;
the neck feature fusion layer is used for carrying out multi-scale feature fusion on backbone features;
And the detection head is used for carrying out multi-scale target detection on backbone characteristics.
6. The method for detecting small target defects on the surface of strip steel based on super resolution and YOLOv8 of claim 5, further comprising:
performing label assignment on bounding boxes based on a target detection labeling method;
wherein performing label assignment on bounding boxes based on the target detection labeling method comprises:
obtaining the bounding boxes of target detection;
on the image data in the enhanced training data set, marking the region having the same shape, size and center as each target bounding box as a positive label, and marking the remaining regions as negative labels;
and inputting the labeled enhanced training data into the strip steel surface small target defect detection model.
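The label assignment of claim 6 can be sketched at pixel granularity as follows; the function name and the pixel-level granularity are illustrative assumptions (a real implementation would typically label grid cells or anchor points rather than raw pixels).

```python
def assign_labels(image_h, image_w, gt_boxes):
    """Assign a positive/negative label to every pixel position.

    A position is positive iff it falls inside a region with the same
    shape, size and center as some ground-truth box (i.e. inside the
    box itself); every other position is negative.  gt_boxes are
    (cx, cy, w, h) in pixels.  Returns a 2D list of 0/1 labels.
    """
    labels = [[0] * image_w for _ in range(image_h)]
    for cx, cy, w, h in gt_boxes:
        x0, x1 = int(cx - w / 2), int(cx + w / 2)
        y0, y1 = int(cy - h / 2), int(cy + h / 2)
        for y in range(max(0, y0), min(image_h, y1)):
            for x in range(max(0, x0), min(image_w, x1)):
                labels[y][x] = 1  # inside a ground-truth region -> positive
    return labels
```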
7. The method for detecting small target defects on the surface of strip steel based on super resolution and YOLOv8 of claim 6, further comprising:
constructing a process improvement scheme model, and inputting the defect detection result into the constructed process improvement scheme model to obtain a process improvement table;
wherein constructing the process improvement scheme model and inputting the defect detection result into it to obtain a process improvement table comprises:
constructing a process improvement scheme model,
wherein the process improvement scheme model comprises:
a defect judging unit for judging whether a defect exists;
a defect type unit for judging the defect type when a defect exists;
a process improvement scheme library for storing process improvement schemes and the success cases corresponding to each scheme;
a process improvement scheme generating unit for retrieving the process improvement scheme corresponding to the defect type, together with its corresponding success case, from the process improvement scheme library, and generating a process improvement table that summarizes them for output;
inputting the defect detection result into the process improvement scheme model;
and, based on the defect detection result, the process improvement scheme model judging whether a defect exists and, if so, judging the defect type and outputting a process improvement table.
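The process improvement scheme model of claim 7 can be sketched as a simple lookup pipeline. The defect types, schemes and success cases below are illustrative placeholders, not content from the patent.

```python
# Scheme library: maps defect type -> improvement scheme and a success case.
SCHEME_LIBRARY = {
    "scratch": {"scheme": "adjust roller alignment",
                "case": "line A reduced scratches after realignment"},
    "inclusion": {"scheme": "improve ladle slag control",
                  "case": "line B cut inclusions after slag control"},
}

def process_improvement_table(detections):
    """detections: list of defect-type strings from the detector
    (an empty list means no defect was found).  Returns a process
    improvement table as a list of rows, or None when no defect exists."""
    if not detections:                 # defect judging unit
        return None
    table = []
    for defect_type in detections:     # defect type unit
        entry = SCHEME_LIBRARY.get(defect_type)   # scheme library lookup
        if entry:                      # scheme generating unit
            table.append((defect_type, entry["scheme"], entry["case"]))
    return table
```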
8. The method for detecting small target defects on the surface of strip steel based on super resolution and YOLOv8 according to claim 7, wherein constructing a preprocessed image dataset based on the preprocessed image data and performing image data enhancement to obtain an enhanced training dataset comprises:
collecting the preprocessed image data to form a preprocessed image dataset;
performing image data enhancement on the preprocessed image dataset using the CutMix method to obtain a first enhanced image set;
performing image data enhancement on the preprocessed image dataset using the Mosaic method at a 4:1 ratio to obtain a second enhanced image set;
performing image data enhancement on the preprocessed image dataset using the Mosaic method at a 9:1 ratio to obtain a third enhanced image set;
and forming the enhanced training dataset by randomly extracting image data from the first enhanced image set, the second enhanced image set and the third enhanced image set at a set extraction ratio.
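The final merging step of claim 8 can be sketched as follows. The extraction ratios, target size and seed are illustrative assumptions; the patent only specifies that images are drawn randomly from the three enhanced sets at a set extraction ratio.

```python
import random

def build_enhanced_training_set(set1, set2, set3,
                                ratios=(0.4, 0.4, 0.2), size=100, seed=0):
    """Randomly draw images from the three enhanced image sets at the
    given extraction ratios and merge them into one training set.
    `ratios`, `size` and `seed` are illustrative defaults."""
    rng = random.Random(seed)
    counts = [int(size * r) for r in ratios]
    counts[-1] = size - sum(counts[:-1])   # absorb rounding error
    merged = []
    for source, n in zip((set1, set2, set3), counts):
        merged.extend(rng.choice(source) for _ in range(n))
    rng.shuffle(merged)                    # mix the three sources together
    return merged
```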
9. The method for detecting small target defects on the surface of strip steel based on super resolution and YOLOv8 according to claim 8, wherein obtaining the original training data comprises:
arranging an upper-surface camera above the upper surface of the strip steel to collect image data of the upper surface of the strip steel;
arranging a lower-surface camera below the lower surface of the strip steel to collect image data of the lower surface of the strip steel;
and, for the same region of the strip steel, symmetrically concatenating and fusing the image data of the upper surface and the image data of the lower surface to obtain the original training data.
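The symmetric concatenation of claim 9 can be sketched as follows. Representing images as lists of pixel rows, and flipping the lower-surface image so the shared strip edge meets at the seam, are both assumptions; the patent does not fix the exact fusion layout.

```python
def fuse_surface_images(upper, lower):
    """Fuse upper- and lower-surface images (lists of pixel rows) of the
    same strip region into one training sample by symmetric vertical
    stacking.  Flipping the lower image so the shared edge meets at the
    seam is an assumption about the 'symmetrical connection'."""
    if len(upper) != len(lower) or any(
            len(a) != len(b) for a, b in zip(upper, lower)):
        raise ValueError("images must cover the same strip region")
    # Stack the upper image on top of the vertically flipped lower image.
    return upper + lower[::-1]
```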
10. A strip steel surface small target defect detection system based on super resolution and YOLOv8, characterized by comprising:
an acquisition module for acquiring original image data;
a preprocessing module for preprocessing the image data based on the SRGAN algorithm to obtain preprocessed image data;
a model construction module for constructing a strip steel surface small target defect detection model based on the YOLOv8 algorithm and improving the model using the normalized Wasserstein distance;
and a defect detection module for inputting the preprocessed image data into the improved strip steel surface small target defect detection model to perform defect detection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310739468.1A CN116630301A (en) | 2023-06-20 | 2023-06-20 | Strip steel surface small target defect detection method and system based on super resolution and YOLOv8 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116630301A true CN116630301A (en) | 2023-08-22 |
Family
ID=87609957
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115294089A (en) * | 2022-08-18 | 2022-11-04 | 陕西科技大学 | Steel surface defect detection method based on improved YOLOv5 |
CN115471466A (en) * | 2022-08-30 | 2022-12-13 | 长春理工大学 | Steel surface defect detection method and system based on artificial intelligence |
CN115862073A (en) * | 2023-02-27 | 2023-03-28 | 国网江西省电力有限公司电力科学研究院 | Transformer substation harmful bird species target detection and identification method based on machine vision |
CN116258707A (en) * | 2023-03-15 | 2023-06-13 | 常州京信新一代信息技术研究院有限公司 | PCB surface defect detection method based on improved YOLOv5 algorithm |
Non-Patent Citations (6)
Title |
---|
CHAUCERG: "Surpassing YOLO5-Face: YOLO-FaceV2 officially open-sourced, packed with tricks and academic points", Jizhi Shutong (blog), pages 1 - 12 *
CHRISTIAN LEDIG ET AL.: "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network", arXiv:1609.04802v5, pages 1 - 19 *
EDISON_G: "An in-depth analysis of YOLOv8: understand it in one article and get hands-on quickly", Computer Vision Research Institute (blog), pages 1 - 9 *
HAOYANG ZHANG ET AL.: "VarifocalNet: An IoU-aware Dense Object Detector", arXiv:2008.13367v2, pages 1 - 11 *
XIANG LI ET AL.: "Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pages 3139 - 3153 *
LIU HUANHUAN: "Development of a post-processing system for numerical simulation of coin embossing forming", China Masters' Theses Full-text Database, Engineering Science and Technology I, page 4 *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117764969A (en) * | 2023-12-28 | 2024-03-26 | 广东工业大学 | Multi-view imaging system and lightweight multi-scale feature fusion defect detection method |
CN118072112A (en) * | 2024-04-19 | 2024-05-24 | 国网山东省电力公司济南供电公司 | Power equipment defect detection method, system, medium and equipment based on extended samples |
CN118072112B (en) * | 2024-04-19 | 2024-08-20 | 国网山东省电力公司济南供电公司 | Power equipment defect detection method, system, medium and equipment based on extended samples |
CN118396956A (en) * | 2024-04-26 | 2024-07-26 | 盐城工学院 | Steel surface defect detection method based on improvement Yolov-tiny |
CN118230131A (en) * | 2024-05-23 | 2024-06-21 | 西安科技大学 | Image recognition and target detection method |
CN118230131B (en) * | 2024-05-23 | 2024-08-09 | 西安科技大学 | Image recognition and target detection method |
CN118570457A (en) * | 2024-08-05 | 2024-08-30 | 山东航天电子技术研究所 | Image super-resolution method based on remote sensing target recognition task driving |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108961235B (en) | Defective insulator identification method based on YOLOv3 network and particle filter algorithm | |
CN110321923B (en) | Target detection method, system and medium for fusion of different-scale receptive field characteristic layers | |
CN116630301A (en) | Strip steel surface small target defect detection method and system based on super resolution and YOLOv8 | |
CN112418117B (en) | Small target detection method based on unmanned aerial vehicle image | |
CN111950515B (en) | Semantic feature pyramid network-based small face detection method | |
CN110852288B (en) | Cell image classification method based on two-stage convolutional neural network | |
CN113920107A (en) | Insulator damage detection method based on improved yolov5 algorithm | |
CN114841972A (en) | Power transmission line defect identification method based on saliency map and semantic embedded feature pyramid | |
CN113724231A (en) | Industrial defect detection method based on semantic segmentation and target detection fusion model | |
CN112766184A (en) | Remote sensing target detection method based on multi-level feature selection convolutional neural network | |
CN115439458A (en) | Industrial image defect target detection algorithm based on depth map attention | |
CN116342894B (en) | GIS infrared feature recognition system and method based on improved YOLOv5 | |
CN113033315A (en) | Rare earth mining high-resolution image identification and positioning method | |
CN115272652A (en) | Dense object image detection method based on multiple regression and adaptive focus loss | |
CN113688821B (en) | OCR text recognition method based on deep learning | |
CN114170526A (en) | Remote sensing image multi-scale target detection and identification method based on lightweight network | |
CN113313678A (en) | Automatic sperm morphology analysis method based on multi-scale feature fusion | |
CN116597411A (en) | Method and system for identifying traffic sign by unmanned vehicle in extreme weather | |
CN115937736A (en) | Small target detection method based on attention and context awareness | |
CN117635628B (en) | Sea-land segmentation method based on context attention and boundary perception guidance | |
CN113657196B (en) | SAR image target detection method, SAR image target detection device, electronic equipment and storage medium | |
Yin et al. | Road Damage Detection and Classification based on Multi-level Feature Pyramids. | |
CN111199199B (en) | Action recognition method based on self-adaptive context area selection | |
CN110889418A (en) | Gas contour identification method | |
CN116893162A (en) | Rare anti-nuclear antibody karyotype detection method based on YOLO and attention neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20230822 |