CN116012372A

CN116012372A - Aluminum surface real-time defect detection method and system based on improved YOLOv5

Info

Publication number: CN116012372A
Application number: CN202310174707.3A
Authority: CN
Inventors: 唐立军; 刘深波; 赵东雪
Original assignee: Changsha University of Science and Technology
Current assignee: Changsha University of Science and Technology
Priority date: 2023-02-28
Filing date: 2023-02-28
Publication date: 2023-04-25

Abstract

The invention discloses an aluminum surface real-time defect detection method and system based on improved YOLOv5. The method comprises the following steps: obtaining pictures of various aluminum surface defects, performing data-enhancement picture preprocessing, and establishing a data set; inputting the preprocessed data set into an improved YOLOv5 network structure that introduces a ghost network, a joint attention mechanism and depthwise separable convolution, and performing enhanced transfer-learning training to obtain an optimized model; and deploying the optimized model on hardware equipment to detect aluminum surface defects in real time. The invention addresses the low real-time performance, insufficient detection precision and low training efficiency of aluminum surface defect detection.

Description

Aluminum surface real-time defect detection method and system based on improved YOLOv5

Technical Field

The invention relates to the technical field of metal defect detection, in particular to an aluminum surface real-time defect detection method and system based on improved YOLOv5.

Background

Aluminum profiles have good thermal conductivity, moisture resistance and other favorable properties, and have become an important base material in fields such as construction, vehicles, ships and housing. With the rapid development of related industries, the demand for high-quality aluminum profiles is also increasing. Surface defects directly affect product quality, so detecting them is of great significance. Traditional manual visual inspection can hardly guarantee accurate results or an efficient inspection process, and manual work brings a series of problems such as low efficiency and operator fatigue. To address these issues, some researchers have applied machine learning methods to the identification of industrial defects. Yu et al. used an SVM to classify wood surface defects; Hu et al. proposed an algorithm based on ellipse fitting and a distance threshold to detect pit defects on steel shell surfaces; Chen et al. used smoothing filtering to detect steel plate surface defects; and Wang et al. proposed an aluminum foil defect detection algorithm based on the SUSAN operator. Although the above work achieves good results in surface defect detection, it still suffers from poor robustness, limited applicability and low real-time performance. In addition, a data set of aluminum surface defects has to be collected and constructed, which is a heavy workload; insufficient data causes overfitting, and scarce feature information degrades training, both of which reduce the final detection precision.

Disclosure of Invention

(I) Technical problem to be solved

In view of the above problems, the invention provides an aluminum surface real-time defect detection method and system based on improved YOLOv5, which address the low real-time performance, insufficient detection precision and low training efficiency of aluminum surface defect detection.

(II) Technical solution

Based on the technical problems, the invention provides an aluminum surface real-time defect detection method based on improved YOLOv5, which comprises the following steps:

S1. Obtaining pictures of various aluminum surface defects, performing data-enhancement picture preprocessing, and establishing a data set;

S2. Inputting the preprocessed data set into an improved YOLOv5 network structure that introduces a ghost network, a joint attention mechanism and depthwise separable convolution, and performing enhanced transfer-learning training to obtain an optimized model; the improved YOLOv5 network structure that introduces a ghost network, a joint attention mechanism and depthwise separable convolution includes:

the input picture passes through a CBL, a first Ghost module, a first C3Ghost module, a second C3Ghost module, a third C3Ghost module, a fourth Ghost module and a fourth C3Ghost module, and attention modules are embedded in the four C3Ghost modules; the output of the second C3Ghost module is concatenated (Concat) with the upsampled output of the first Dw convolution module, and the result passes through a fifth C3Ghost module to output a first-scale feature map; the output of the third C3Ghost module is concatenated with the upsampled output of the third Dw convolution module and then passes through a sixth C3Ghost module and the first Dw convolution module, and the resulting output is concatenated with the output of the fifth C3Ghost module after it has passed through a second Dw convolution module, and the result passes through a seventh C3Ghost module to output a second-scale feature map; the output of the fourth C3Ghost module passes through the SPPF and the third Dw convolution module and is concatenated with the output of the seventh C3Ghost module after it has passed through a fourth Dw convolution module, and the result passes through an eighth C3Ghost module to output a third-scale feature map; the C3Ghost module is a C3 module with Ghost added, and the Dw convolution module is a depthwise separable convolution module;

S3. Deploying the optimized model on hardware equipment to detect aluminum surface defects in real time.

Further, the step S2 includes:

S21. Inputting the preprocessed or further strengthened data set into the improved YOLOv5 network structure that introduces a ghost network, a joint attention mechanism and depthwise separable convolution, and obtaining a pre-trained model by transfer learning;

S22. Judging whether the accuracy for each type of defect reaches a set threshold; if not, entering step S23, and if so, entering step S24;

S23. Further strengthening the data set corresponding to each defect whose accuracy does not reach the set threshold, and returning to step S21 to continue training;

S24. Obtaining the optimized model through training.

Further, step S23 also includes fine-tuning of parameters.

Further, in the step S22, the set threshold is 95%.

Further, the various defects include pinholes, dirt, and scratches.

Further, the attention module adopts a channel attention module (CAM) and a spatial attention module (SAM) connected in parallel. The height, width and channel number of an input feature map F are H, W and C respectively. In the CAM, the global spatial information of F is first compressed through Max Pool and Avg Pool to generate two feature maps S1 and S2 of size 1×1×C; S1 and S2 pass through an MLP to obtain two one-dimensional feature maps, which are normalized to obtain a weight feature map MC. Meanwhile, in the SAM, F first passes through a 1×1 convolution module, and the result is input into a Sigmoid function for activation to obtain a weight feature map MS. Finally, the weight feature maps MC and MS are combined in parallel by element-wise addition, and an output feature map F-A is obtained after a Sigmoid activation function.

Further, the preprocessing of the picture in step S1 includes:

pasting aluminum surface defect pictures of different defect objects at different scales onto a new background picture;

performing Gaussian blur adjustment on the aluminum surface defect pictures;

and at least one of the following processing methods:

randomly selecting four aluminum surface defect pictures and splicing them through Mosaic enhancement to obtain a new picture;

performing brightness adjustment on the aluminum surface defect pictures;

performing rotation angle adjustment on the aluminum surface defect pictures;

performing cropping adjustment on the aluminum surface defect pictures;

performing translation adjustment on the aluminum surface defect pictures;

performing mirror adjustment on the aluminum surface defect pictures.

Further, the input pictures are uniformly resized to 640×640×3 (height × width × channels).

Further, the channels × height × width of the first-scale feature map is 255×80×80, that of the second-scale feature map is 255×40×40, and that of the third-scale feature map is 255×20×20.

The invention also discloses an aluminum surface real-time defect detection system based on improved YOLOv5, which comprises: at least one processor; and at least one memory communicatively coupled to the processor, wherein the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the aluminum surface real-time defect detection method based on improved YOLOv5.

(III) Beneficial effects

The technical scheme of the invention has the following advantages:

(1) In the improved YOLOv5 network structure, a ghost network is introduced into the YOLOv5 network to reduce the number of model parameters, an attention module with a joint attention mechanism reduces the total number of model parameters to simplify the model, and depthwise separable convolution further reduces the number of model parameters. The model is therefore lightweight, the amount of computation is reduced, model training is accelerated and real-time detection performance is improved. During model training, enhanced transfer learning further strengthens the data set so that the training result for every defect type is satisfactory, which further improves detection accuracy and training efficiency;

(2) The invention strengthens the feature information in the data set, in particular small-target feature information such as pinhole defects, through oversampling by pasting and Gaussian blur. This addresses the problems of scarce feature information and poor training results and further improves detection precision;

(3) Through Mosaic data enhancement and traditional data enhancement, the invention effectively enlarges the data set, alleviates the overfitting caused by an insufficient amount of data, and further improves detection precision.

Drawings

The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and should not be construed as limiting the invention in any way, in which:

FIG. 1 is a conceptual diagram of an embodiment of an improved YOLOv5-based method for real-time defect detection of aluminum surfaces;

FIG. 2 is a schematic diagram of picture preprocessing by pasting and Gaussian blur according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of image preprocessing after clipping, masking, and noise addition according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a preprocessing of a picture with random flipping and brightness change according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a structural network of improved YOLOv5 according to an embodiment of the present invention;

FIG. 6 is a network schematic diagram of a Ghost module according to an embodiment of the invention;

FIG. 7 is a network diagram of an attention module according to an embodiment of the present invention;

FIG. 8 is a network schematic of a depth separable convolution according to an embodiment of the present invention;

FIG. 9 is a bar graph of average recognition accuracy of various types of defects in accordance with an embodiment of the present invention;

FIG. 10 is a schematic diagram of a detection result according to an embodiment of the present invention.

Detailed Description

The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.

An embodiment of the invention is an aluminum surface real-time defect detection method and system based on improved YOLOv5, as shown in fig. 1, comprising the following steps:

S1. Obtaining pictures of various aluminum surface defects, performing data-enhancement picture preprocessing, and establishing a data set; the data-enhancement picture preprocessing includes:

carrying out Gaussian blur adjustment on the aluminum surface defect picture;

and comprises at least one of the following treatment methods:

brightness adjustment is carried out on the aluminum surface defect picture;

cutting and adjusting the aluminum product surface defect picture;

carrying out translation adjustment on the aluminum surface defect picture;

In this embodiment, aluminum surface defect pictures of different defect objects at different scales are first pasted onto a new background image; this pasting technique quickly yields rich and novel training data. Gaussian blur is then applied to the pasted picture. Gaussian blur is an image blurring filter that computes the transformation of each pixel in the image using a normal distribution. As shown in FIG. 2, oversampling through pasting and Gaussian blur strengthens the feature information in the data set, especially small-target feature information such as pinhole defects: it increases the number of pinhole defect pictures and the proportion of the picture occupied by pinhole defects, which addresses the problems of scarce feature information and poor training results.
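The pasting and Gaussian blur step can be illustrated with a minimal sketch. It is not the implementation used in the embodiment: the images are plain nested lists of grey values, and the function names, the kernel radius and the sigma value are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(radius, sigma):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
                 for k in range(-radius, radius + 1))
             for x in range(width)]
            for row in image]

def transpose(image):
    return [list(column) for column in zip(*image)]

def gaussian_blur(image, radius=1, sigma=1.0):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def paste(background, patch, top, left):
    """Copy a defect patch onto a background; return the image and its box."""
    out = [row[:] for row in background]
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            out[top + dy][left + dx] = value
    box = (left, top, left + len(patch[0]), top + len(patch))  # x1, y1, x2, y2
    return out, box

# Hypothetical toy data: a uniform grey background and a 2x2 dark "pinhole".
background = [[100.0] * 8 for _ in range(8)]
pinhole = [[0.0, 0.0], [0.0, 0.0]]
pasted, box = paste(background, pinhole, top=3, left=3)
blurred = gaussian_blur(pasted, radius=1, sigma=1.0)
```

Pasting the same patch at several positions and scales would increase the number of pinhole samples, which is the oversampling effect the paragraph describes.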

Next, Mosaic data enhancement randomly selects four pictures and splices them into a new picture. Each picture has its corresponding bounding boxes, so the spliced picture carries the bounding boxes of all four pictures; feeding the new picture into the neural network is equivalent to learning four pictures at once. Finally, traditional data enhancement methods such as brightness change, rotation, cropping, translation, mirroring, occlusion and noise addition are applied. As shown in FIG. 3 and FIG. 4, these operations enlarge the data set tenfold, which alleviates the overfitting caused by an insufficient amount of data. The data set is then divided into a training set and a test set for subsequent training.
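A simplified sketch of the Mosaic splicing described above is given here. It assumes four equally sized pictures placed in a fixed 2×2 grid and shifts each bounding box by the offset of its tile; practical Mosaic implementations usually also randomize the split point and rescale the tiles, which this sketch omits. The function names and the toy data set are hypothetical.

```python
import random

def mosaic(images, boxes_per_image):
    """Stitch four equally sized images into a 2x2 grid and shift their boxes.

    images: list of four 2-D lists (H x W).
    boxes_per_image: list of four lists of (x1, y1, x2, y2) boxes.
    """
    assert len(images) == 4 and len(boxes_per_image) == 4
    height, width = len(images[0]), len(images[0][0])
    top = [images[0][y] + images[1][y] for y in range(height)]
    bottom = [images[2][y] + images[3][y] for y in range(height)]
    canvas = top + bottom
    # (dx, dy) offset of each tile inside the stitched canvas.
    offsets = [(0, 0), (width, 0), (0, height), (width, height)]
    merged = []
    for (dx, dy), boxes in zip(offsets, boxes_per_image):
        for x1, y1, x2, y2 in boxes:
            merged.append((x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, merged

def random_mosaic(dataset, rng):
    """Pick four (image, boxes) samples at random and stitch them."""
    picked = rng.sample(dataset, 4)
    return mosaic([image for image, _ in picked],
                  [boxes for _, boxes in picked])

# Hypothetical toy data set: six 4x4 images, each with one bounding box.
dataset = [([[i] * 4 for _ in range(4)], [(1, 1, 3, 3)]) for i in range(6)]
canvas, boxes = random_mosaic(dataset, random.Random(0))
```

The stitched picture carries all four bounding boxes, so one training sample exposes the network to four source pictures.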

In this embodiment, the defects include pinholes, dirt and scratches; other aluminum surface defect types may be selected as required.

S2. Inputting the preprocessed data set into an improved YOLOv5 network structure that introduces a ghost network, a joint attention mechanism and depthwise separable convolution, and performing enhanced transfer-learning training to obtain an optimized model; this comprises the following steps:

The transfer learning is conventional transfer learning, through which a pre-trained model is obtained. Compared with imitation learning and lifelong learning, conventional transfer learning greatly improves average accuracy and shortens training time.

The test set is used to judge whether the training result of the pre-trained model is satisfactory for each type of defect. For defects whose training results are unsatisfactory, the method enters step S23 to further strengthen the data set of those defects and returns to step S21 to continue training and obtain a new pre-trained model, until the training results for all types of defects are satisfactory; the pre-trained model at that point is the final optimized model.

For a defect whose training result is unsatisfactory, for example when the accuracy on pictures of pinhole and dirt defects does not reach the set threshold of 95%, the data sets corresponding to the pinhole and dirt defects are further strengthened by the enhancement methods of step S1. After parameter fine-tuning, the method returns to step S21 and continues training with transfer learning. Continuing transfer-learning training on the pre-trained model at that point, using the data set further strengthened for the unsatisfactory defects, is what is meant by enhanced transfer learning; it makes the training results of all defect types satisfactory, further improves average accuracy and shortens training time.

S24. Obtaining the optimized model through training.
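The control flow of steps S21 to S24 can be sketched as a loop. This is only an outline under stated assumptions: `train`, `evaluate`, `strengthen` and `fine_tune` are hypothetical callables standing in for the real training pipeline, accuracies are expressed in percent, and the `max_rounds` guard is an addition that the source does not mention.

```python
def enhanced_transfer_training(dataset, train, evaluate, strengthen, fine_tune,
                               threshold=95, max_rounds=10):
    """Repeat S21-S23 until every defect class reaches the threshold (S24)."""
    model = None
    for round_index in range(1, max_rounds + 1):
        model = train(model, dataset)              # S21: transfer learning
        accuracy = evaluate(model)                 # per-class accuracy in percent
        weak = [name for name, value in accuracy.items()
                if value < threshold]              # S22: compare with threshold
        if not weak:
            return model, round_index              # S24: optimized model
        dataset = strengthen(dataset, weak)        # S23: strengthen weak classes
        model = fine_tune(model)                   # S23: parameter fine-tuning
    raise RuntimeError("threshold not reached within max_rounds")

# Toy stand-ins (hypothetical): the "data set" counts how often each class was
# strengthened, and every strengthening round adds 2 points of accuracy.
BASE_ACCURACY = {"pinhole": 91, "dirt": 93, "scratch": 97}

def toy_train(model, dataset):
    return {name: BASE_ACCURACY[name] + 2 * dataset[name] for name in BASE_ACCURACY}

def toy_evaluate(model):
    return dict(model)

def toy_strengthen(dataset, weak):
    return {name: count + (1 if name in weak else 0)
            for name, count in dataset.items()}

def toy_fine_tune(model):
    return model

final_model, rounds = enhanced_transfer_training(
    {"pinhole": 0, "dirt": 0, "scratch": 0},
    toy_train, toy_evaluate, toy_strengthen, toy_fine_tune)
```

In the toy run only the classes below 95% are strengthened in each round, which mirrors the per-defect strengthening described for step S23.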

For the improved YOLOv5 network structure, a new detection algorithm is proposed based on the YOLOv5 algorithm. The Backbone layer is constructed from Ghost modules, and an attention mechanism module is embedded in the Ghost module, which compresses the Backbone network while attending to more channel and spatial information; the Neck network is then compressed using depthwise separable convolution, further reducing the model size. As shown in FIG. 5, the structure comprises:

the input picture passes through a CBL, a first Ghost module, a first C3Ghost module, a second C3Ghost module, a third C3Ghost module, a fourth Ghost module and a fourth C3Ghost module, and attention modules are embedded in the four C3Ghost modules; the output of the second C3Ghost module is concatenated (Concat) with the upsampled output of the first Dw convolution module, and the result passes through a fifth C3Ghost module to output a first-scale feature map; the output of the third C3Ghost module is concatenated with the upsampled output of the third Dw convolution module and then passes through a sixth C3Ghost module and the first Dw convolution module, and the resulting output is concatenated with the output of the fifth C3Ghost module after it has passed through a second Dw convolution module, and the result passes through a seventh C3Ghost module to output a second-scale feature map; the output of the fourth C3Ghost module passes through the SPPF and the third Dw convolution module and is concatenated with the output of the seventh C3Ghost module after it has passed through a fourth Dw convolution module, and the result passes through an eighth C3Ghost module to output a third-scale feature map. The C3Ghost module is a C3 module with Ghost added; the Dw convolution module is a depthwise separable convolution module; Concat denotes joining two feature maps along a dimension; CBL is a basic structure consisting of Conv + BN + Leaky ReLU activation; and SPPF is Spatial Pyramid Pooling-Fast.

According to the network model, the input pictures are uniformly resized to 640×640×3, and feature maps at three scales are extracted at different layers of the backbone network. The output of the second C3Ghost module is a 128×80×80 feature map, which after processing yields the 255×80×80 first-scale feature map; the output of the third C3Ghost module is a 256×40×40 feature map, which yields the 255×40×40 second-scale feature map; and the output of the fourth C3Ghost module is a 512×20×20 feature map, which yields the 255×20×20 third-scale feature map.
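The three grid sizes quoted above are consistent with down-sampling a 640×640 input by factors of 8, 16 and 32. Those strides are an assumption carried over from the usual YOLOv5 layout and are not stated in the text; the channel count of 255 is simply taken as given. A short sketch of the arithmetic, with a hypothetical function name:

```python
def head_shapes(input_size=640, strides=(8, 16, 32), channels=255):
    """Return (channels, height, width) of each output scale.

    The strides are assumed, not taken from the source text.
    """
    return [(channels, input_size // s, input_size // s) for s in strides]

shapes = head_shapes()
```

With the default arguments, `shapes` holds the first-, second- and third-scale sizes in that order.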

A Ghost network is introduced into the YOLOv5 algorithm to reduce the number of model parameters. The Ghost network is a network structure based on the Ghost convolution module. Conventional CNN structures usually reach the desired accuracy through a large number of floating-point operations; lightweight models such as MobileNet and ShuffleNet reduce the amount of floating-point computation, but do not deal effectively with the redundant feature maps generated by convolution. As shown in FIG. 6, the Ghost convolution module first generates a set of basic original feature maps through an ordinary 1×1 convolution, and then applies the linear transformations Φ₁, Φ₂, …, Φₖ to these feature maps one by one to obtain another set of redundant feature maps, which are fused with the original feature maps to increase the number of channels. Obtaining redundant feature maps through linear operations costs less than producing them with ordinary convolution. In this way the total number of model parameters is reduced and the model is simplified.
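The parameter saving of a Ghost convolution module can be estimated with a short calculation. The sketch assumes a common formulation that the text does not spell out: the primary convolution produces `c_out / ratio` original feature maps, each linear transformation Φ is a `d × d` depthwise kernel, and biases are ignored. The names `ratio` and `cheap_kernel` and the example channel counts are hypothetical.

```python
def conv_params(c_in, c_out, kernel):
    """Weights of an ordinary convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel

def ghost_params(c_in, c_out, kernel=1, ratio=2, cheap_kernel=3):
    """Weights of a Ghost module under the assumptions stated above."""
    intrinsic = c_out // ratio                     # original feature maps
    primary = conv_params(c_in, intrinsic, kernel)
    # Each of the remaining (ratio - 1) * intrinsic maps comes from one
    # cheap depthwise kernel of size cheap_kernel x cheap_kernel.
    cheap = (ratio - 1) * intrinsic * cheap_kernel * cheap_kernel
    return primary + cheap

ordinary_1x1 = conv_params(64, 128, 1)             # 64 * 128 = 8192
ghost_1x1 = ghost_params(64, 128, kernel=1)        # 64 * 64 + 64 * 9 = 4672
ordinary_3x3 = conv_params(64, 128, 3)             # 64 * 128 * 9 = 73728
ghost_3x3 = ghost_params(64, 128, kernel=3)        # 64 * 64 * 9 + 576 = 37440
```

Under these assumptions the Ghost module needs roughly half the weights of the ordinary convolution it replaces, and the saving approaches the factor `ratio` as the primary kernel grows.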

The joint attention mechanism (UAM), that is, the attention module, is embedded in the model structure to reduce the loss of model precision during model compression. The UAM is formed by connecting a channel attention module (CAM) and a spatial attention module (SAM) in parallel. The parallel structure encodes the feature map information in the spatial dimension and the channel dimension at the same time, so that the information between feature map channels and spatial positions is better utilized. A detailed diagram of the UAM is shown in FIG. 7, where F is the input feature map, and H, W and C are its height, width and channel number respectively. In the CAM, the global spatial information of F is first compressed through Max Pool and Avg Pool to generate two feature maps S1 and S2 of size 1×1×C; two one-dimensional feature maps are then obtained through a multilayer perceptron (MLP); and the two one-dimensional feature maps are normalized to obtain a weight feature map MC. Meanwhile, in the SAM, F first passes through a 1×1 convolution module, and the result is input into a Sigmoid function for activation to obtain a weight feature map MS. Finally, MC and MS are combined in parallel by element-wise addition, and an output feature map F-A is obtained after a Sigmoid activation function.
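A rough sketch of the parallel CAM and SAM computation is shown below on nested lists. Several details are assumptions because the text leaves them open: the MLP is a shared two-layer perceptron with ReLU, the "normalization" of the two MLP outputs is taken to be a sum followed by a Sigmoid, and the element-wise addition broadcasts MC over the spatial positions and MS over the channels. The sketch stops at the Sigmoid of that sum and does not guess how the result is combined with F. All weights are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feature, w1, w2):
    """CAM: max/avg pool over space, shared MLP, sigmoid -> C weights (MC).

    feature: list of C channels, each an H x W list of lists.
    w1: hidden x C matrix, w2: C x hidden matrix (hypothetical MLP weights).
    """
    s1 = [max(max(row) for row in ch) for ch in feature]               # Max Pool
    s2 = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
          for ch in feature]                                           # Avg Pool

    def mlp(vec):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return [sigmoid(a + b) for a, b in zip(mlp(s1), mlp(s2))]

def spatial_attention(feature, conv_weights):
    """SAM: 1x1 convolution across channels, then sigmoid -> H x W weights (MS)."""
    height, width = len(feature[0]), len(feature[0][0])
    return [[sigmoid(sum(w * ch[y][x] for w, ch in zip(conv_weights, feature)))
             for x in range(width)]
            for y in range(height)]

def joint_attention(feature, w1, w2, conv_weights):
    """Add MC and MS by broadcasting, then apply a sigmoid (C x H x W result)."""
    mc = channel_attention(feature, w1, w2)
    ms = spatial_attention(feature, conv_weights)
    return [[[sigmoid(mc_c + ms_yx) for ms_yx in row] for row in ms]
            for mc_c in mc]

# Toy input with C = 2, H = 2, W = 2 and hypothetical weights.
feature = [[[1.0, 2.0], [3.0, 4.0]],
           [[0.0, 1.0], [1.0, 0.0]]]
weights = joint_attention(feature,
                          w1=[[0.5, -0.5]],
                          w2=[[1.0], [-1.0]],
                          conv_weights=[0.1, 0.2])
```

Because MC and MS are both Sigmoid outputs, their sum lies between 0 and 2, so every entry of `weights` lies strictly between 0.5 and about 0.88.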

Depthwise separable convolution (DwConv) is used in the Neck network of YOLOv5 to further reduce the number of model parameters. As shown in FIG. 8, a depthwise separable convolution consists of two parts: a depthwise convolution (DWConv) and a pointwise convolution (PWConv). Assume the input data has height × width × channels of 7×1×3. The depthwise convolution is performed first: since there are 3 input channels, 3 convolution kernels of size 3×1 are used, and the output feature map is 5×1×3. The computation of the depthwise convolution is 1×3×3×5 = 45, and its number of parameters is 1×3×3 = 9. Then, to obtain 16 feature maps, 16 convolution kernels of size 1×1×3 are applied, and the final output feature map is 5×1×16. The computation of the pointwise convolution is 1×3×5×16 = 240, and its number of parameters is 3×16 = 48. The total computation of the depthwise separable convolution is therefore 285, and the total number of parameters is 57. A conventional convolution would require a total computation of 1×3×3×5×16 = 720 and 1×3×3×16 = 144 parameters, far more than the depthwise separable convolution. Compared with conventional convolution, depthwise separable convolution therefore effectively reduces the computation and the number of trainable parameters and speeds up model training.
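The figures in the worked example can be reproduced with a few lines of arithmetic. The sketch assumes stride 1 and no padding, so that a 3×1 kernel turns a height of 7 into 7 − 3 + 1 = 5, and it counts one multiply-accumulate per kernel weight per output position; the function names are hypothetical.

```python
def depthwise_cost(out_h, out_w, k_h, k_w, channels):
    """Multiply-accumulates and weights of a depthwise convolution."""
    params = k_h * k_w * channels
    return params * out_h * out_w, params

def pointwise_cost(out_h, out_w, in_ch, out_ch):
    """Multiply-accumulates and weights of a 1x1 (pointwise) convolution."""
    params = in_ch * out_ch
    return params * out_h * out_w, params

def standard_cost(out_h, out_w, k_h, k_w, in_ch, out_ch):
    """Multiply-accumulates and weights of a conventional convolution."""
    params = k_h * k_w * in_ch * out_ch
    return params * out_h * out_w, params

in_h, in_w, in_ch, out_ch = 7, 1, 3, 16
k_h, k_w = 3, 1
out_h, out_w = in_h - k_h + 1, in_w - k_w + 1      # 5 x 1, stride 1, no padding

dw_macs, dw_params = depthwise_cost(out_h, out_w, k_h, k_w, in_ch)
pw_macs, pw_params = pointwise_cost(out_h, out_w, in_ch, out_ch)
std_macs, std_params = standard_cost(out_h, out_w, k_h, k_w, in_ch, out_ch)

separable_macs = dw_macs + pw_macs                 # 45 + 240 = 285
separable_params = dw_params + pw_params           # 9 + 48 = 57
```

For this example the separable form needs about 40% of the computation and parameters of the conventional convolution (285 versus 720, and 57 versus 144).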

The optimized model is deployed to the hardware equipment. The hardware equipment includes an LED light source, a CCD image sensor, a 7-inch touchscreen, an NVIDIA Jetson Nano main control computer, an encoder, a conveyor belt and two power supplies. The main control computer is connected to the light source, the image sensor, the touchscreen and the encoder respectively. The conveyor belt conveys the aluminum material to be detected. The light source is located on one side of the conveyor belt and is directed at the aluminum material passing on the conveyor belt; the imaging element is located on the other side of the conveyor belt and scans the aluminum material passing on it. The imaging element includes an imaging lens and the image sensor; the image sensor scans images through the imaging lens and sends them to the main control computer for processing.

The NVIDIA Jetson Nano was chosen as the main control computer after comparing the performance parameters of the Nano and the TX2: the running power of the Nano is as low as 10 W, and its performance of up to 32 TOPS is 20 times that of the TX2. The 8-core ARM v8.2 64-bit CPU, 512-core Volta GPU and 7-way VLIW vision processor of the Nano provide strong computing capacity, and this computing performance is also suitable for large amounts of image processing and for training and running deep learning models. Its low power consumption also allows experiments to run effectively for long periods under outdoor test conditions. As an embedded system for the new generation of autonomous machines, the NVIDIA Jetson Nano increases the running speed of autonomous machine software while lowering power consumption. Each system is a complete system-on-module (SOM) with CPU, GPU, PMIC, DRAM and storage, which saves development time and money and is scalable, so a custom system can be built on a SOM suited to the application.

In the aluminum surface real-time defect detection system, the Nano plays three main roles: (1) loading and running the main program; (2) acting as the central control device, connected to the display screen and the camera through the USB expansion ports and the various interfaces provided by the development kit carrier board; and (3) processing and analyzing images, and training and running the deep learning model.

In this embodiment, 100 pictures from the original data set are used as the test set. All experiments are performed on a Windows 11 system equipped with an Intel i7-12700 CPU and an NVIDIA GeForce RTX 3090 24 GB GPU. The methods are implemented in Python 3.8 with PyTorch 1.11 as the neural network framework. To ensure a fair comparison, all algorithms involved in the comparison are tested with the same training parameters: a batch size of 32, a learning rate of 0.0025, a momentum of 0.937 and a weight decay of 0.0005. mAP (0.5, 11-point) is used as the detection metric. The final average recognition accuracy for each defect type is shown in FIG. 9: 94.4% for pinholes, 98.3% for scratches and 99.5% for dirt, giving an overall average of 97.4%. Compared with the YOLOv5s model, the model of this embodiment is 74.3% smaller, at only 3.7 MB, while the overall mAP reaches 97.4%. An automatic aluminum surface defect detection device was built; it detects aluminum surface defects in real time at a frame rate of 24.5 fps, weighs only 0.85 kg and is easy to carry. The visualized detection results are shown in FIG. 10.
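The metric "mAP (0.5, 11-point)" is read here as mean average precision at an IoU threshold of 0.5 with 11-point interpolation; that reading is an assumption, since the text does not define the metric further. A minimal sketch of the 11-point average precision for one class, followed by the mean over the three per-class figures reported above (the precision-recall values in the example are hypothetical):

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated average precision for one class.

    For each recall level t in 0.0, 0.1, ..., 1.0, take the highest precision
    among points whose recall is at least t (0 if there is none), then average.
    """
    total = 0.0
    for i in range(11):
        t = i / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11.0

# Hypothetical precision-recall points for a single class.
example_ap = ap_11_point(recalls=[0.5, 1.0], precisions=[1.0, 0.5])

# Mean over the per-class figures reported in the text.
per_class = {"pinhole": 94.4, "scratch": 98.3, "dirt": 99.5}
mean_ap = round(sum(per_class.values()) / len(per_class), 1)
```

Averaging the three reported per-class values gives 97.4, which matches the overall figure quoted in the paragraph.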

Finally, it should be noted that the above detection method may be converted into software program instructions, which may be implemented by a detection system including a processor and a memory, or by computer instructions stored in a non-transitory computer-readable storage medium. The integrated units implemented in the form of software functional units may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) or a processor to perform some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

In summary, the method and the system for detecting the real-time defect of the aluminum surface based on the improved YOLOv5 have the following beneficial effects:

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims

1. An aluminum surface real-time defect detection method based on improved YOLOv5 is characterized by comprising the following steps:

2. The method for detecting real-time defects on an aluminum surface based on modified YOLOv5 of claim 1, wherein the step S2 comprises:

S24. Obtaining the optimized model through training.

3. The method for detecting real-time defects on an aluminum surface based on modified YOLOv5 of claim 2, further comprising fine-tuning parameters in step S23.

4. The method for detecting real-time defects on an aluminum surface based on modified YOLOv5 according to claim 2, wherein the set threshold value is 95% in the step S22.

5. The improved YOLOv5-based real-time defect detection method of aluminum surfaces of claim 2, wherein the defects include pinholes, dirt and scratches.

6. The method for detecting real-time defects on an aluminum surface based on improved YOLOv5 according to claim 1, wherein the attention module adopts a channel attention module (CAM) and a spatial attention module (SAM) connected in parallel; the height, width and channel number of an input feature map F are H, W and C respectively; in the CAM, the global spatial information of F is first compressed through Max Pool and Avg Pool to generate two feature maps S1 and S2 of size 1×1×C, S1 and S2 pass through an MLP to obtain two one-dimensional feature maps, and the two one-dimensional feature maps are normalized to obtain a weight feature map MC; meanwhile, in the SAM, F first passes through a 1×1 convolution module, and the result is input into a Sigmoid function for activation to obtain a weight feature map MS; finally, the weight feature maps MC and MS are combined in parallel by element-wise addition, and an output feature map F-A is obtained after a Sigmoid activation function.

7. The improved YOLOv5-based real-time defect detection method of aluminum surfaces of claim 1, wherein the picture preprocessing in step S1 comprises:

carrying out Gaussian blur adjustment on the aluminum surface defect picture;

and comprises at least one of the following treatment methods:

brightness adjustment is carried out on the aluminum surface defect picture;

cutting and adjusting the aluminum product surface defect picture;

carrying out translation adjustment on the aluminum surface defect picture;

8. The method for detecting real-time defects on an aluminum surface according to claim 1, wherein the input pictures are uniformly resized to 640×640×3 (height × width × channels).

9. The method of claim 8, wherein the channels × height × width of the first-scale feature map is 255×80×80, that of the second-scale feature map is 255×40×40, and that of the third-scale feature map is 255×20×20.

10. An improved YOLOv5-based real-time defect detection system for aluminum surfaces, comprising:

at least one processor; and at least one memory communicatively coupled to the processor, wherein:

the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the improved YOLOv5-based aluminum surface real-time defect detection method of any one of claims 1 to 9.