CN116342536A

CN116342536A - Aluminum strip surface defect detection method, system and equipment based on lightweight model

Info

Publication number: CN116342536A
Application number: CN202310311028.6A
Authority: CN
Inventors: 李毅波; 吕泽华; 黄明辉; 潘晴
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2023-03-28
Filing date: 2023-03-28
Publication date: 2023-06-27

Abstract

The invention discloses an aluminum strip surface defect detection method, system and equipment based on a lightweight model, and relates to the field of surface defect detection and machine vision. The surface defect segmentation model constructed by the method is a lightweight model composed of a main feature extraction network, a multi-scale feature fusion network, a depth supervision network and a prediction network, so that the number and complexity of model parameters are greatly reduced while high segmentation accuracy is maintained, the surface defect detection efficiency, accuracy and reliability of the aluminum strip are improved, theoretical basis is laid for deployment of an aluminum strip surface defect detection algorithm at a mobile end and embedded equipment, and the method has wide application prospect.

Description

Aluminum strip surface defect detection method, system and equipment based on lightweight model

Technical Field

The invention relates to the field of surface defect detection and machine vision in industrial scenes, in particular to an aluminum strip surface defect detection method, system and equipment based on a lightweight model.

Background

The aluminum strip has good mechanical properties and low cost, and is widely applied to the fields of electronic products, packaging containers, new energy automobiles and the like. However, due to factors such as fault aging, dust, unreasonable process parameters and the like of rolling equipment, various defect types such as holes, black spots, scratches and the like are generated on the surface of the aluminum strip in the production process, the attractiveness of subsequent finished products and semi-finished products are directly influenced, and even the safety and the usability of the products are influenced, so that the detection of the defects on the surface of the aluminum strip becomes an indispensable key ring in actual production. At present, the detection of the surface defects of the aluminum strip on the production line is mainly based on manual naked eye observation, the method is low in efficiency and high in labor cost, the judgment of the defects is seriously influenced by subjective factors of workers, and the detection accuracy cannot be ensured.

In addition, traditional machine-learning-based defect detection methods select and extract defect features through manually designed image processing; such methods struggle to capture all the features of a complex surface and generalize poorly. In recent years, deep-learning-based convolutional neural networks (CNNs) have addressed these problems: a CNN progressively extracts complex high-level feature information from an image through multiple convolutional layers, has strong feature learning capability, and offers a new solution for detecting aluminum strip surface defects. CNN-based object detection networks such as the YOLO (You Only Look Once) series and the R-CNN series have been widely applied to identifying and locating complex surface defects of industrial products. However, the following problems remain to be solved for surface defect detection in current aluminum strip production. First, some defects on the aluminum strip surface, such as scratches and color differences, appear as strips with constantly changing angles; in this case the bounding boxes used by the YOLO and R-CNN series are of limited use, because they enclose large normal areas that contain no defect. Second, real-time performance is a necessary requirement for surface defect segmentation during aluminum strip production: the visual inspection system installed on the production line must segment defects in real time without affecting production speed, so the network needs high inference efficiency and a small model size to achieve the best balance of segmentation speed, accuracy and reliability. Third, because the production environment is complex, aluminum strip surface defects show marked size differences, inter-class similarity and intra-class variation, which makes accurate segmentation and identification of defect boundaries very challenging; the network therefore needs to extract richer feature information and learn more comprehensive shallow detail and deep semantic representations to further improve segmentation accuracy.

Disclosure of Invention

Aiming at the problems in the background art, the invention provides a method, a system and equipment for detecting the surface defects of the aluminum strip based on a lightweight model, so that the calculated amount and the parameter amount of the model are greatly reduced, and the efficiency, the precision and the reliability of detecting the surface defects of the aluminum strip are improved.

In order to achieve the above object, the present invention provides the following solutions:

in one aspect, the invention provides a method for detecting surface defects of an aluminum strip based on a lightweight model, which comprises the following steps:

collecting original image data of the surface of the aluminum strip in real time by using an industrial line scanning camera;

uniformly dividing the original image data into a plurality of square subgraphs, and screening out subgraphs containing defects;

pixel-level labeling is carried out on the screened defect subgraphs, and the contours and the categories of the defects are marked to obtain a surface defect data set; the surface defect data set comprises a training set, a verification set and a test set;

carrying out data enhancement on the surface defect data set to obtain a surface defect data set after data enhancement;

constructing a surface defect segmentation model, wherein the surface defect segmentation model comprises a trunk feature extraction network, a multi-scale feature fusion network, a depth supervision network and a prediction network;

Training, verifying and testing the surface defect segmentation model by utilizing the surface defect data set with the enhanced data to obtain a trained surface defect segmentation model;

and detecting the surface defects in the aluminum strip production process in real time by adopting the trained surface defect segmentation model.

Optionally, the pixel-level labeling is performed on the screened defect subgraph, and the defect outline and the category are marked to obtain a surface defect data set, which specifically comprises:

manually classifying and marking the defects in the screened defect subgraph in a pixel level mode by using LabelMe, marking the defects in different categories by using different colors, and forming a tag data set in a JSON format;

converting the JSON format tag data set into a PNG format tag data set;

randomly assigning 20% of the subgraphs of each defect type, together with their corresponding PNG format images, to the test set; of the remaining images, 90% are assigned to the training set and the other 10% to the verification set;

and respectively integrating the test set, the training set and the verification set formed by different types of defect subgraphs and the corresponding PNG format images, and manufacturing the integrated data sets into the format the same as the VOC2007 data set, so as to obtain the surface defect data set.

Optionally, the data enhancement is performed on the surface defect data set to obtain a surface defect data set after data enhancement, which specifically includes:

and carrying out data enhancement on the surface defect data set by adopting one or more of the following data enhancement modes: random cropping, random horizontal flipping, random vertical flipping, scale jittering, color jittering or Mosaic, to obtain the surface defect data set after data enhancement.

Optionally, the constructing the surface defect segmentation model specifically includes:

constructing a joint loss function of the surface defect segmentation model;

building a trunk feature extraction network based on a MobileViTv2 network and a CBAM attention module;

constructing a multi-scale feature fusion network based on the SPPF module and the HRFPN network;

establishing a depth supervision network based on a convolution layer and the joint loss function;

constructing a prediction network based on the convolution layer;

and connecting the trunk feature extraction network, the multi-scale feature fusion network, the depth supervision network and the prediction network to form the surface defect segmentation model.

Optionally, the training, verifying and testing the surface defect segmentation model by using the surface defect data set after data enhancement to obtain a trained surface defect segmentation model, which specifically includes:

Training a surface defect segmentation model by using the training set after data enhancement, and storing a corresponding weight file in the training process;

checking each weight file on the verification set after data enhancement, adjusting the super parameters of the surface defect segmentation model, and simultaneously selecting the weight file with the highest segmentation accuracy as the model parameters of the trained surface defect segmentation model;

inputting the test set with the enhanced data into a trained surface defect segmentation model to check the approximate generalization performance of the model.

In another aspect, the present invention provides an aluminum strip surface defect detection system based on a lightweight model, comprising:

the original image acquisition module is used for acquiring original image data of the surface of the aluminum strip in real time by using an industrial line scanning camera;

the image clipping and screening module is used for uniformly dividing the original image data into a plurality of square subgraphs and screening the subgraphs containing the defects;

the image marking module is used for carrying out pixel-level marking on the screened defect subgraphs, marking the defect outline and the category, and obtaining a surface defect data set; the surface defect data set comprises a training set, a verification set and a test set;

the data enhancement module is used for carrying out data enhancement on the surface defect data set to obtain a surface defect data set after data enhancement;

The model construction module is used for constructing a surface defect segmentation model and comprises a trunk feature extraction network, a multi-scale feature fusion network, a depth supervision network and a prediction network;

the model training module is used for training, verifying and testing the surface defect segmentation model by utilizing the surface defect data set after the data enhancement to obtain a trained surface defect segmentation model;

and the surface defect detection module is used for detecting the surface defects in the aluminum strip production process in real time by adopting the trained surface defect segmentation model.

On the other hand, the invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the aluminum strip surface defect detection method based on the lightweight model when executing the computer program.

In another aspect, the present invention also provides a non-transitory computer readable storage medium, on which a computer program is stored, the computer program when executed implementing the method for detecting surface defects of an aluminum strip based on a lightweight model.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

According to the aluminum strip surface defect detection method, system and equipment based on the lightweight model, provided by the invention, the original image data of the aluminum strip surface is acquired in real time by using an industrial line scanning camera; uniformly dividing the original image data into a plurality of square subgraphs, and screening out subgraphs containing defects; pixel-level labeling is carried out on the screened defect subgraphs, and the contours and the categories of the defects are marked to obtain a surface defect data set; the surface defect data set comprises a training set, a verification set and a test set; carrying out data enhancement on the surface defect data set to obtain a surface defect data set after data enhancement; constructing a surface defect segmentation model, wherein the surface defect segmentation model comprises a trunk feature extraction network, a multi-scale feature fusion network, a depth supervision network and a prediction network; training, verifying and testing the surface defect segmentation model by utilizing the surface defect data set with the enhanced data to obtain a trained surface defect segmentation model; and detecting the surface defects in the aluminum strip production process in real time by adopting the trained surface defect segmentation model. The method can greatly reduce the calculated amount and the parameter amount of the model and improve the detection efficiency, the precision and the reliability of the surface defects of the aluminum strip.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a method for detecting surface defects of an aluminum strip based on a lightweight model;

FIG. 2 is a schematic diagram of the image acquisition, clipping, screening and labeling process in the method of the present invention;

FIG. 3 is a schematic diagram of a surface defect segmentation model according to the present invention;

FIG. 4 is a schematic structural diagram of a lightweight feature aggregation node according to the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention aims to provide an aluminum strip surface defect detection method, system and equipment based on a lightweight model, so as to greatly reduce the calculated amount and parameter amount of the model and improve the detection efficiency, precision and reliability of the aluminum strip surface defect.

In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.

FIG. 1 is a flowchart of the aluminum strip surface defect detection method based on a lightweight model. Referring to FIG. 1, the method includes:

Step 1: acquiring original image data of the surface of the aluminum strip in real time by using an industrial line scanning camera.

Image acquisition during aluminum strip production uses an industrial line scanning camera and an LED light source to capture, in real time, original image data of the fast-moving aluminum strip surface. Referring to FIG. 2, each acquired original image is 4096×1024 pixels, and its long side (4096 pixels) corresponds to the actual width of the aluminum strip.

Specifically, an industrial line scanning camera and a white LED parallel light source are equipped on a production line. Then, the encoder is adjusted to trigger the camera to shoot the aluminum strip at a frequency which accords with the production line speed, and the shot original picture is stored in the SD card.

Step 2: uniformly dividing the original image data into a plurality of square subgraphs, and screening out the subgraphs containing defects.

Step 2 crops and screens the original images. Referring to FIG. 2, each original image captured in step 1 is uniformly divided into a plurality of low-resolution square subgraphs; defect-free subgraphs are removed to reduce the computation of the model, and the subgraphs containing defects are kept as the defect subgraphs used in step 3.

In step 2, no fewer than 600×n subgraphs containing defects are screened in total (n is the number of defect classes) to build the data set, with no fewer than 600 images per defect class. The data set covers 5 common defects: color differences, cracking, pitting, black spots and scratches. During screening, the samples are balanced so that the number of images is approximately the same for every defect class.
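As an illustration only, the uniform division in step 2 can be sketched as follows. The 512×512 subgraph size and the function name `split_into_tiles` are assumptions and are not specified in this description; the screening of defective subgraphs is not shown.

```python
import numpy as np

def split_into_tiles(image: np.ndarray, tile: int) -> list:
    """Cut an H x W (x C) image into non-overlapping tile x tile squares."""
    h, w = image.shape[:2]
    if h % tile or w % tile:
        raise ValueError("image size must be a multiple of the tile size")
    return [
        image[r:r + tile, c:c + tile]
        for r in range(0, h, tile)      # rows of tiles, top to bottom
        for c in range(0, w, tile)      # tiles within a row, left to right
    ]

# Synthetic stand-in for one 4096 x 1024 line-scan frame from step 1.
frame = np.zeros((1024, 4096), dtype=np.uint8)
tiles = split_into_tiles(frame, tile=512)   # 512 is an assumed subgraph size
print(len(tiles), tiles[0].shape)
```

With these assumed sizes, one frame yields 2 rows of 8 subgraphs each.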

Step 3: carrying out pixel-level labeling on the screened defect subgraphs, marking the outline and the category of each defect, and obtaining a surface defect data set.

Step 3 labels the defect subgraphs screened in step 2 at pixel level, marking the defect outlines and categories to obtain the surface defect data set, and specifically comprises the following steps:

step 3.1: manually classifying and marking the defects in the screened defect subgraph in a pixel level mode by using LabelMe, marking the defects of different categories by using different colors, and forming a defect picture tag data set in a JSON format;

Step 3.2: converting the JSON format tag data set into PNG format tag data set, wherein PNG file names are in one-to-one correspondence with original picture names;

Step 3.3: randomly assigning 20% of the subgraphs of each defect type, together with their corresponding PNG format images, to the test set; of the remaining images, 90% are assigned to the training set and the other 10% to the verification set;

step 3.4: and respectively integrating the test set, the training set and the verification set formed by different types of defect subgraphs and the corresponding PNG format images, and manufacturing the integrated data sets into the format the same as the VOC2007 data set, so as to obtain the surface defect data set.

Thus, the surface defect dataset includes a training set, a validation set, and a test set in the same format as the VOC2007 dataset. The training set is used for training a model and determining parameters, and the number is the largest. The validation set is used to detect the performance of the model and can in turn adjust model parameters, mainly hyper-parameters, based on the test results. The test set is used for detecting the generalization performance of the trained final model, and model parameters are not changed after the test set is passed.

If there are only training and testing sets, the model cannot be evaluated before the testing set. It is generally more desirable in practice to evaluate the model to a certain degree before the final model is determined, then to adjust the model parameters (especially the hyper parameters) according to the evaluation results, and then to retrain the model until the model is basically trained, and finally to test the model again by using the test set. The test is to check the approximate generalization performance of the model, and the model parameters are not updated continuously. It is for this purpose that the present invention divides a validation set. The validation set is also, to some extent, a training set, since the model parameters will be adjusted according to the results of the validation set.
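The split described in step 3.3 can be sketched for a single defect class as follows. This is a minimal sketch: the function name `split_dataset`, the fixed random seed and the file stems are hypothetical, and rounding to whole images is an assumption.

```python
import random

def split_dataset(names, test_frac=0.2, train_frac_of_rest=0.9, seed=0):
    """Split one defect class: 20% test, then 90% / 10% of the rest as train / val."""
    names = list(names)
    random.Random(seed).shuffle(names)           # random assignment
    n_test = round(len(names) * test_frac)
    test, rest = names[:n_test], names[n_test:]
    n_train = round(len(rest) * train_frac_of_rest)
    return rest[:n_train], rest[n_train:], test

# 600 hypothetical file stems for one defect class.
stems = [f"scratch_{i:04d}" for i in range(600)]
train, val, test = split_dataset(stems)
print(len(train), len(val), len(test))
```

Applying the same function to every defect class and merging the three resulting lists gives the integrated sets of step 3.4. Overall, the training, verification and test sets then hold roughly 72%, 8% and 20% of the images.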

Step 4: carrying out data enhancement on the surface defect data set to obtain the surface defect data set after data enhancement.

The surface defect data set from step 3 is enhanced using one or a combination of the following data enhancement modes: random cropping, random horizontal flipping, random vertical flipping, scale jittering, color jittering or Mosaic, to obtain the surface defect data set after data enhancement. During data enhancement, the defects in a transformed image must still match their labeled boundaries.
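One way to keep labels matched to the transformed image is to apply every geometric transform to the image and its label mask together. The sketch below does this for the two random flips only; the function name `random_flip`, the 0.5 flip probability and the toy data are assumptions.

```python
import numpy as np

def random_flip(image, mask, rng):
    """Apply the same random horizontal / vertical flips to image and label mask."""
    if rng.random() < 0.5:                       # horizontal flip (left-right)
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                       # vertical flip (up-down)
        image, mask = image[::-1, :], mask[::-1, :]
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

rng = np.random.default_rng(0)
image = np.arange(16, dtype=np.uint8).reshape(4, 4)
mask = (image > 10).astype(np.uint8)             # toy label: "defect" where value > 10
aug_image, aug_mask = random_flip(image, mask, rng)

# After the transform the label still marks exactly the defect pixels.
assert np.array_equal(aug_mask, (aug_image > 10).astype(np.uint8))
```

Photometric transforms such as color jittering change only the image and leave the mask untouched.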

Step 5: constructing a surface defect segmentation model, wherein the surface defect segmentation model comprises a trunk feature extraction network, a multi-scale feature fusion network, a depth supervision network and a prediction network.

The surface defect segmentation model construction process specifically comprises the following steps:

Step 5.1: constructing a joint loss function of the surface defect segmentation model.

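The invention combines the Dice loss function $L_{Dice}$ and the binary cross-entropy loss function $L_{BCE}$ into the joint loss function $L_{Union}$. Written in their standard forms, consistent with the symbol definitions that follow, the expressions are:

$$L_{Dice} = 1 - \frac{2\sum \hat{y}\,y + \varepsilon}{\sum \hat{y} + \sum y + \varepsilon}$$

$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[\,y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\right]$$

$$L_{Union} = \alpha L_{Dice} + \beta L_{BCE}$$

where $\hat{y}$ and $y$ respectively represent the segmentation result predicted by the model and the manual annotation, and the sums in $L_{Dice}$ run over all pixels; $\varepsilon$ is a small value used to avoid severe numerical fluctuation; $N$ is the number of images per batch (batch size) input to the model, and $\hat{y}_i$ and $y_i$ respectively represent the predicted segmentation result and the manual annotation of the i-th image in each batch; $\alpha$ and $\beta$ are parameters that balance the proportions of the two loss functions.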
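A minimal NumPy sketch of such a joint loss is given below, assuming the standard Dice and binary cross-entropy forms. The values of alpha, beta and the two eps constants are placeholders, not values stated in this description, and a real training pipeline would use a differentiable framework rather than NumPy.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """1 - Dice coefficient; eps keeps the ratio stable for empty masks."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy, averaged over pixels and then over the N images."""
    p = np.clip(pred, eps, 1.0 - eps)            # avoid log(0)
    per_pixel = -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))
    per_image = per_pixel.reshape(len(pred), -1).mean(axis=1)
    return per_image.mean()

def union_loss(pred, target, alpha=0.5, beta=0.5):
    """Weighted sum of the two losses; alpha and beta are assumed values."""
    return alpha * dice_loss(pred, target) + beta * bce_loss(pred, target)

# Toy batch: N = 2 probability maps of size 4 x 4 and their binary labels.
target = np.zeros((2, 4, 4))
target[:, 1:3, 1:3] = 1.0
good = target.copy()                 # perfect prediction
poor = np.full_like(target, 0.5)     # uninformative prediction
print(union_loss(good, target), union_loss(poor, target))
```

The perfect prediction gives a loss close to zero, while the uninformative one is penalized by both terms.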

Step 5.2: the backbone feature extraction network is built based on the MobileViTv2 network and the CBAM attention module.

The adopted MobileViTv2 network is a lightweight feature extraction network. One improvement of the invention is that the final average pooling layer and fully connected layer of the network are removed, so that the network consists of ten layers: one 3×3 ordinary convolution layer, six MobileNetV2 block layers and three MobileViTv2 block layers. The ten layers of the MobileViTv2 network are divided into the following five basic blocks, and the first layer in each basic block performs a downsampling operation:

basic block one: 3×3 Conv (stride=2) + MobileNetV2 block (stride=1);

basic block two: MobileNetV2 block (stride=2) + MobileNetV2 block (stride=1);

basic block three: MobileNetV2 block (stride=2) + MobileViTv2 block;

basic block four: MobileNetV2 block (stride=2) + MobileViTv2 block;

basic block five: MobileNetV2 block (stride=2) + MobileViTv2 block.

The above five basic blocks are directly connected, i.e. the output characteristics of one basic block are taken as the input of the next basic block. Another improvement of the present invention is that a CBAM attention module is embedded after each basic block, and the output characteristics of each basic block are processed by the CBAM attention module and then input into the next basic block. Meanwhile, five different scale features output after CBAM processing are directly input into the multi-scale feature fusion network, as shown in FIG. 3, and five features are firstly input into five different SPPF modules in the multi-scale feature fusion network.

The invention embeds a CBAM attention module after each basic block of the MobileViTv2 network; the resulting network, named CBAM-MobileViTv2, serves as the backbone feature extraction network of the invention. Specifically, the CBAM attention module consists of a channel attention module and a spatial attention module. The channel attention module consists of an average pooling layer and a maximum pooling layer over the spatial dimensions, a multi-layer perceptron (MLP) and a sigmoid activation function; the spatial attention module consists of an average pooling layer and a maximum pooling layer over the channel dimension and a sigmoid activation function.
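The following NumPy sketch illustrates the two attention stages under simplifying assumptions. The reduction ratio r = 4, the random weights and the scalar mixing weights `w_avg` and `w_max` are hypothetical. The original CBAM design combines the two spatial maps with a convolution, which this sketch replaces by a weighted sum for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (C, H, W). A shared two-layer MLP scores avg- and max-pooled vectors."""
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)      # ReLU hidden layer
    avg = x.mean(axis=(1, 2))                    # pool over spatial dims -> (C,)
    mx = x.max(axis=(1, 2))
    scale = sigmoid(mlp(avg) + mlp(mx))          # one gate per channel
    return x * scale[:, None, None]

def spatial_attention(x, w_avg=1.0, w_max=1.0):
    """Pool over channels, mix the two maps, gate every pixel."""
    avg = x.mean(axis=0)                         # (H, W)
    mx = x.max(axis=0)
    gate = sigmoid(w_avg * avg + w_max * mx)     # simplified: no convolution
    return x * gate[None, :, :]

def cbam(x, w1, w2):
    """Channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2))

rng = np.random.default_rng(0)
C, r = 48, 4                                     # r: assumed reduction ratio
x = rng.standard_normal((C, 8, 8))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y = cbam(x, w1, w2)
print(y.shape)
```

Both gates lie between 0 and 1, so the module rescales features without changing their shape.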

Through five downsampling operations, the improved CBAM-MobileViTv2 network outputs feature maps at five scales, denoted P₁, P₂, P₃, P₄ and P₅. As shown in FIG. 3, their dimensions are 256×256×48, 128×128×96, 64×64×192, 32×32×288 and 16×16×384, respectively.
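The listed dimensions follow from halving the resolution once per basic block. The short check below assumes a 512×512 input, which is inferred from the 256×256 size of the first feature map and is not stated explicitly; the channel widths are those given above.

```python
# Channel widths of P1..P5 as listed in the description.
channels = [48, 96, 192, 288, 384]

size = 512            # assumed input resolution (inferred, not stated)
shapes = []
for c in channels:
    size //= 2        # the first layer of each basic block has stride 2
    shapes.append((size, size, c))

print(shapes)
# [(256, 256, 48), (128, 128, 96), (64, 64, 192), (32, 32, 288), (16, 16, 384)]
```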

Step 5.3: constructing a multi-scale feature fusion network based on the SPPF module and the HRFPN network.

The multi-scale feature fusion network of the invention consists of SPPF modules and an HRFPN network. Each SPPF module connects in series a convolution layer with a stride of 1 and a 1×1 convolution kernel and three 5×5 max pooling layers; the output features of these layers are fused by a concatenation operation, and the number of channels is then adjusted by another 1×1 convolution layer. Each 1×1 convolution layer is followed by a batch normalization layer and a SiLU layer.
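The data flow of one SPPF module can be sketched in NumPy as follows. This is a simplified sketch: the batch normalization and SiLU layers are omitted, the weights are random, and the channel counts are assumptions chosen only for illustration.

```python
import numpy as np

def max_pool_same(x, k=5):
    """k x k max pooling with stride 1, padded so H x W is unchanged. x: (C, H, W)."""
    p = k // 2
    padded = np.pad(x, ((0, 0), (p, p), (p, p)), constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k), axis=(1, 2))
    return windows.max(axis=(-2, -1))

def conv1x1(x, weight):
    """1 x 1 convolution with stride 1: a per-pixel linear map over channels."""
    return np.einsum("oc,chw->ohw", weight, x)

def sppf(x, w_in, w_out):
    y0 = conv1x1(x, w_in)                        # first 1 x 1 convolution
    y1 = max_pool_same(y0)                       # three pooling layers in series
    y2 = max_pool_same(y1)
    y3 = max_pool_same(y2)
    stacked = np.concatenate([y0, y1, y2, y3], axis=0)   # fuse by concatenation
    return conv1x1(stacked, w_out)               # adjust the channel count

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))             # assumed 8-channel input
w_in = rng.standard_normal((4, 8))               # 8 -> 4 channels
w_out = rng.standard_normal((8, 16))             # 4 * 4 = 16 -> 8 channels
out = sppf(x, w_in, w_out)
print(out.shape)
```

Chaining three 5×5 poolings gives effective receptive fields of 5, 9 and 13 pixels while reusing intermediate results.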

The HRFPN network consists of four branches that become progressively shallower from top to bottom: the number of paths increases step by step in the width direction, and as the number of paths increases, the number of nodes along the depth direction of each path gradually decreases.

The HRFPN network uses lightweight feature aggregation nodes. As shown in FIG. 4, each node receives feature maps of different resolutions from two or three sides (a low-level feature map, the current feature map and a high-level feature map), which are brought to the same resolution by downsampling or upsampling. These feature maps are then integrated by element-wise addition and fused by a MobileNetV2 block with a stride of 1 (stride=1). In FIG. 4, W and H respectively represent the width and height of a feature map, and the subscripts l, c and h correspond to the low-level, current and high-level feature maps; for example, W_c and H_c are the width and height of the current feature map. C represents the number of channels of a feature map, and the feature maps in each node have the same number of channels.

The HRFPN network uses fast normalized fusion to weight features, assigning different weights to input features of different scales. It is expressed as:

$$O = \sum_i \frac{w_i}{\varepsilon + \sum_j w_j}\, I_i$$

Input features of different resolutions contribute unequally to the output feature, so an additional weight is applied to each input feature to represent its importance. Here $w_i$ is a learnable multidimensional tensor weight, $\varepsilon = 0.0001$ is used to avoid numerical instability, $I_i$ denotes the i-th input feature map, each normalized weight lies between 0 and 1, and $O$ denotes the feature map obtained after applying the normalized weights.
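A minimal sketch of fast normalized fusion is shown below. It assumes one scalar weight per input feature map, clamped to be non-negative so that each normalized weight stays between 0 and 1; the description itself speaks of tensor-valued learnable weights, so the scalar form and the function name are simplifications.

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Weighted sum of same-shape feature maps with weights normalized by their sum."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)   # assumed non-negative
    norm = w / (eps + w.sum())                               # each value in [0, 1)
    return sum(n * f for n, f in zip(norm, features))

# Two toy feature maps already resized to the same resolution.
low = np.ones((2, 2))
high = 3.0 * np.ones((2, 2))
fused = fast_normalized_fusion([low, high], weights=[1.0, 1.0])
print(fused)
```

Unlike softmax-based weighting, this normalization needs no exponentials, which suits a lightweight model.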

The feature maps P₁, P₂, P₃, P₄ and P₅ output by the trunk feature extraction network are processed by the SPPF modules and then sent to the HRFPN network, which outputs four feature maps Q₁, Q₂, Q₃ and Q₄ with the same resolution.

Step 5.4: establishing a depth supervision network based on the convolution layer and the joint loss function.

Referring to FIG. 3, the feature maps Q₁, Q₂, Q₃ and Q₄ output by the multi-scale feature fusion network are upsampled by a factor of two so that their resolution matches the input image, and the number of channels of each feature map is adjusted to n+1 (the n defect classes plus background) by a convolution layer (Conv) with a stride of 1 and a 1×1 convolution kernel. Using the joint loss function L_Union constructed in step 5.1, the loss between each of the four resulting feature maps and the manual label y is calculated, giving four joint loss values Loss1 to Loss4. Calculating the loss on feature maps of different resolutions makes full use of both high-resolution detail information and low-resolution abstract semantic information, promotes information flow within the model and accelerates network optimization.

Step 5.5: a predictive network is constructed based on the convolutional layers.

As shown in fig. 3, the four feature maps output by the depth supervision network are subjected to a splicing operation, and the number of channels of the obtained feature map is adjusted to n+1 (background) through a convolution layer with a step length of 1 and a convolution kernel of 1×1. This feature map is not only used as the output of the final prediction network

And the Loss calculation is also carried out in the same way, so that a joint Loss function calculation result Loss5 is obtained. And the loss calculation of the finally fused feature images is mutually assisted with the loss calculation of different feature images in the above deep supervision network, so that multi-scale fusion information can be fully utilized, and the accuracy of the model in dividing defects is improved.

Step 5.6: connecting the trunk feature extraction network, the multi-scale feature fusion network, the depth supervision network and the prediction network to form the surface defect segmentation model.

Therefore, the surface defect segmentation model constructed by the invention adopts CBAM-MobileViTv2 to extract defect characteristics, adopts an SPPF module and an HRFPN network to carry out multi-scale characteristic fusion, adopts a deep supervision network to accelerate the convergence process of the network, adopts a prediction network to output final defect prediction results, realizes a lightweight model which can segment n different types of defects, has high segmentation precision, strong generalization capability, small network model and high segmentation speed, can be rapidly deployed on embedded equipment, and meets the requirement of real-time segmentation of defects on an aluminum strip production line.

Step 6: training, verifying and testing the surface defect segmentation model by using the surface defect data set after data enhancement to obtain a trained surface defect segmentation model.

The training, verifying and testing process of the surface defect segmentation model specifically comprises the following steps:

Step 6.1: training the surface defect segmentation model by using the training set after data enhancement, and storing the corresponding weight files during training.

The weight files saved during training are the files of model parameters produced by the iterative computation that minimizes the loss function between the real labels and the predicted results. A weight file therefore holds the parameters of the segmentation model obtained by training. As the iterations proceed, multiple weight files are generated, the loss function keeps decreasing, and the saved weights predict defects more and more accurately.

Step 6.2: loading each weight file into the surface defect segmentation model, checking it on the verification set after data enhancement, adjusting the hyperparameters of the surface defect segmentation model, and selecting the weight file with the highest segmentation accuracy as the model parameters of the trained surface defect segmentation model.

Segmentation accuracy is typically measured by the mean intersection over union (mIoU), the average over all classes of the ratio of intersection to union, as follows:

$$\mathrm{mIoU} = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}$$

where p_ij is the number of pixels of class i predicted as class j, p_ji is the number of pixels of class j predicted as class i, p_ii is the number of pixels of class i correctly predicted as class i, and k is the number of defect classes.
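As a minimal sketch of how the mIoU described above might be computed from a confusion matrix, the following Python function uses only the standard library. The function name `mean_iou`, the toy matrix values and the choice to skip classes that never occur are assumptions made for illustration, not details taken from the invention.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] = number of pixels of true class i predicted as class j.
    Classes with an empty union are skipped (an assumption of this sketch).
    """
    n = len(confusion)
    ious = []
    for i in range(n):
        tp = confusion[i][i]                          # p_ii
        row = sum(confusion[i])                       # sum over j of p_ij
        col = sum(confusion[r][i] for r in range(n))  # sum over j of p_ji
        union = row + col - tp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: background plus two defect classes, with made-up counts.
cm = [
    [50, 2, 3],
    [4, 30, 1],
    [1, 0, 9],
]
print(round(mean_iou(cm), 4))
```

Evaluating each saved weight file with such a function on the validation set gives one score per file, from which the highest-scoring file can be selected.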

Step 6.3: Input the data-enhanced test set into the trained surface defect segmentation model to check the generalization performance of the model.

Step 7: Use the trained surface defect segmentation model to detect surface defects in real time during aluminum strip production.

For surface defect segmentation prediction, it is only necessary to input the image to be inspected into the trained surface defect segmentation model (that is, the model loaded with the weight file having the highest segmentation accuracy). The model then outputs the defect types, the defect positions and the defect boundaries (defect contours), realizing real-time detection of surface defects in the aluminum strip production process.

Compared with prior-art detection methods, the method has the following advantages. The lightweight surface defect segmentation model constructed by the invention uses an improved MobileViTv2 as the feature extraction network, with a CBAM attention mechanism embedded after each basic block of MobileViTv2 to enrich the feature information extracted by the network. In the multi-scale feature fusion network, the SPPF module strengthens the connection between local and global information in feature maps of different scales, and the HRFPN network fully fuses shallow detail information while retaining high-level semantic information, so that defects of complex types with large differences in shape and size can be segmented accurately. In the deep supervision network, the loss function is computed on the output feature maps of the five branches, so that the network is trained more fully and converges faster. Besides defect type and position information, the segmentation result also gives the accurate contours and boundaries of the defects, from which the defect area and density can be calculated. This helps the enterprise obtain more detailed defect information and judge the quality of the aluminum strip more accurately, so the application prospect is broad.
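To illustrate how defect area and density could be derived from a pixel-level segmentation result, here is a small sketch in plain Python. It assumes the prediction is a 2D label map in which 0 is the background. The names `defect_stats` and `defect_density`, and the definition of density as the fraction of defect pixels in the image, are hypothetical choices for this example. All pixels of one class are pooled, so separate defect instances of the same class are not distinguished.

```python
def defect_stats(mask, background=0):
    """Per-class pixel area and bounding box from a 2D label map.

    Returns {class_id: {"area": int, "bbox": (row_min, col_min, row_max, col_max)}}.
    """
    stats = {}
    for r, row in enumerate(mask):
        for c, cls in enumerate(row):
            if cls == background:
                continue
            s = stats.setdefault(cls, {"area": 0, "bbox": (r, c, r, c)})
            s["area"] += 1
            r0, c0, r1, c1 = s["bbox"]
            s["bbox"] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
    return stats


def defect_density(mask, background=0):
    """Fraction of all pixels that belong to any defect class."""
    total = sum(len(row) for row in mask)
    defect = sum(1 for row in mask for cls in row if cls != background)
    return defect / total


# Toy 4 x 4 label map with two defect classes (1 and 2).
mask = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 2, 0, 0],
    [1, 0, 0, 0],
]
print(defect_stats(mask))
print(defect_density(mask))
```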

The following provides a specific embodiment of the aluminum strip surface defect detection method based on the lightweight model, which comprises the following steps:

S1: Acquire images of the aluminum strip surface. An industrial line-scan camera and an LED light source are used to capture, in real time, raw images of the fast-moving aluminum strip surface; each raw image is 4096 × 1024 pixels.

During image acquisition, the industrial line-scan camera and a white LED parallel light source are installed on the production line. The aluminum strip moves along the line at 50 m/min, an encoder is adjusted to trigger the camera at a frequency matched to the line speed, and the captured raw images are stored on an SD card.

S2: Crop and screen the images. Each raw image captured in S1 is divided evenly into 16 square lower-resolution sub-images of 512 × 512 pixels, and the sub-images containing defects are selected. In total, 3852 defect-containing sub-images were selected to build a data set covering 5 common defect types. During selection, the samples were balanced so that the number of images of each defect type is approximately the same.
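The tiling arithmetic in S2 can be checked with a short sketch: a 4096 × 1024 image splits into 8 × 2 = 16 non-overlapping 512 × 512 tiles. The helper name `tile_boxes` and the row-by-row tile order are assumptions for illustration.

```python
def tile_boxes(width, height, tile):
    """Return (left, top, right, bottom) boxes that cover the image without overlap."""
    if width % tile or height % tile:
        raise ValueError("image size must be a multiple of the tile size")
    return [
        (x, y, x + tile, y + tile)
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


boxes = tile_boxes(4096, 1024, 512)
print(len(boxes))          # number of sub-images per raw image
print(boxes[0], boxes[-1])  # first and last crop boxes
```

Each box can be passed to an image library's crop routine to produce the corresponding sub-image.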

S3: Image annotation. The defect sub-images selected in S2 are annotated at pixel level, marking the defect contours and classes, to obtain a surface defect data set.

LabelMe is used to manually classify the defects in each image and annotate their pixel-level contours, with different colors marking different defect classes, producing a label data set in JSON format. The JSON label data set is then converted into a PNG label data set, in which each PNG file is named to correspond one-to-one with its original image. For each defect type, 20% of the images (with their corresponding PNG labels) are randomly assigned to the test set; of the remaining images, 90% form the training set and the last 10% form the validation set. The splits of the different defect types are then merged into the final test, training and validation sets: 2768 images for training, 312 for validation and 772 for testing. The data set is organized in the same format as the VOC2007 data set to obtain the surface defect data set.
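The per-class split described above (20% test, then 90%/10% of the remainder for training and validation) might be sketched as follows. The function name `split_per_class`, the rounding rule, the fixed seed and the placeholder class and file names are all assumptions of this example.

```python
import random


def split_per_class(files_by_class, test_frac=0.2, train_frac=0.9, seed=0):
    """Split each class separately, then merge the per-class splits."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for _cls, files in sorted(files_by_class.items()):
        files = sorted(files)
        rng.shuffle(files)
        n_test = round(len(files) * test_frac)   # 20% of this class for testing
        rest = files[n_test:]
        n_train = round(len(rest) * train_frac)  # 90% of the remainder for training
        splits["test"] += files[:n_test]
        splits["train"] += rest[:n_train]
        splits["val"] += rest[n_train:]
    return splits


# Placeholder class names and file names, for illustration only.
demo = {
    "class_a": [f"a_{i}.png" for i in range(100)],
    "class_b": [f"b_{i}.png" for i in range(50)],
}
result = split_per_class(demo)
print({name: len(items) for name, items in result.items()})
```

Splitting per class before merging keeps the class proportions similar across the three sets, which is consistent with the sample balancing in S2.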

S4: Data enhancement. The surface defect data set is the basis of defect detection; because defect data are scarce in industrial practice, data enhancement is applied to the surface defect data set from S3. The data enhancement techniques mainly include random cropping, random horizontal flipping, random vertical flipping, scale jittering, color jittering and Mosaic. During data enhancement, the defects in the transformed image must still match the annotated boundaries.
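The requirement that defects still match their annotated boundaries after transformation means that every geometric transform has to be applied identically to the image and to its label mask. A minimal sketch for the two flips, with images represented as nested lists and with hypothetical helper names, could look like this:

```python
import random


def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Mirror a 2D grid top to bottom."""
    return [list(row) for row in grid[::-1]]


def random_flip_pair(image, mask, rng, p=0.5):
    """Apply the same random flips to an image and its label mask."""
    if rng.random() < p:
        image, mask = hflip(image), hflip(mask)
    if rng.random() < p:
        image, mask = vflip(image), vflip(mask)
    return image, mask


rng = random.Random(0)
image = [[10, 0, 0], [0, 0, 0]]  # the value 10 stands for a defect pixel
mask = [[1, 0, 0], [0, 0, 0]]    # label 1 marks the same pixel
aug_image, aug_mask = random_flip_pair(image, mask, rng)
print(aug_image, aug_mask)
```

Color jittering, by contrast, changes only pixel values and would be applied to the image alone.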

S5: Construct the surface defect segmentation model. The model uses CBAM-MobileViTv2 to extract defect features, an SPPF (Spatial Pyramid Pooling Fast) module and an HRFPN (High Resolution Feature Pyramid Network) network to perform multi-scale feature fusion, a deep supervision network to accelerate convergence of the network, and a prediction network to output the final defect prediction result; wherein:

S5.1: Construct the loss function of the segmentation network. The Dice loss function

$$L_{Dice} = 1 - \frac{2\,|\hat{y} \cap y| + \varepsilon}{|\hat{y}| + |y| + \varepsilon}$$

and the binary cross-entropy loss function

$$L_{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]$$

are combined into the joint loss function L_Union, expressed as follows:

$$L_{Union} = \alpha L_{Dice} + \beta L_{BCE}$$

where ŷ and y denote the segmentation result predicted by the model and the manual annotation, respectively; ε is a small value used to avoid severe numerical fluctuation; N is the number of images per batch (batch size) input to the model; ŷ_i and y_i denote the predicted segmentation result and the manual annotation of the i-th image in each batch, respectively; and α and β are parameters that balance the proportions of the two loss functions, both set to 0.5.
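The joint loss can be illustrated with a small numeric sketch. It treats the prediction and the annotation as flat lists of per-pixel probabilities and binary labels and uses the common soft Dice form. The smoothing value `eps`, the clipping constant and the per-pixel averaging in the cross-entropy term are assumptions of this sketch rather than values stated in the invention; only α = β = 0.5 comes from the text.

```python
import math


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on flat lists of probabilities and 0/1 labels."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)


def bce_loss(pred, target, clip=1e-7):
    """Binary cross-entropy averaged over all elements."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, clip), 1.0 - clip)  # keep log() finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def joint_loss(pred, target, alpha=0.5, beta=0.5):
    """Weighted sum of the Dice and binary cross-entropy losses."""
    return alpha * dice_loss(pred, target) + beta * bce_loss(pred, target)


target = [1, 0, 1, 0]
print(joint_loss([0.9, 0.1, 0.9, 0.1], target))  # close prediction
print(joint_loss([0.5, 0.5, 0.5, 0.5], target))  # uninformative prediction
```

The Dice term responds to region overlap and is less sensitive to the imbalance between defect and background pixels, while the cross-entropy term gives a per-pixel gradient signal; the equal weights combine the two.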

S5.2: Build the backbone feature extraction network based on the MobileViTv2 network and the CBAM attention module.

The MobileViTv2 network is a lightweight feature extraction network. The invention removes its final average pooling layer and fully connected layer, leaving ten layers: one 3 × 3 standard convolution layer, six MobileNetV2 block layers and three MobileViTv2 block layers. These ten layers are divided into the following five basic blocks, and the first layer in each basic block performs downsampling:

Basic block one: 3 × 3 Conv (stride = 2) + MobileNetV2 block (stride = 1);

Basic block two: MobileNetV2 block (stride = 2) + MobileNetV2 block (stride = 1);

Basic block three: MobileNetV2 block (stride = 2) + MobileViTv2 block;

Basic block four: MobileNetV2 block (stride = 2) + MobileViTv2 block;

Basic block five: MobileNetV2 block (stride = 2) + MobileViTv2 block;

Each basic block of the MobileViTv2 network is followed by a CBAM attention module (Convolutional Block Attention Module), and the entire network is named CBAM-MobileViTv2. The CBAM attention mechanism comprises a channel attention module and a spatial attention module. The channel attention module comprises an average pooling layer and a max pooling layer over the spatial dimensions, a multilayer perceptron and a sigmoid activation function; the spatial attention module comprises an average pooling layer and a max pooling layer over the channel dimension and a sigmoid activation function. Through five downsampling operations, the improved CBAM-MobileViTv2 network outputs feature maps at five scales, denoted P₁, P₂, P₃, P₄, P₅.
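Since the first layer of each basic block downsamples with a stride of 2, the spatial size of the five output feature maps can be worked out directly. The sketch below assumes a 512 × 512 input (the sub-image size used in the embodiment) and exact halving at each stage; the helper name `pyramid_sizes` is hypothetical.

```python
def pyramid_sizes(input_size, num_stages=5, stride=2):
    """Spatial size after each stride-2 stage, starting from the input size."""
    h, w = input_size
    sizes = []
    for _ in range(num_stages):
        h, w = h // stride, w // stride
        sizes.append((h, w))
    return sizes


for name, size in zip(["P1", "P2", "P3", "P4", "P5"], pyramid_sizes((512, 512))):
    print(name, size)
```

Under these assumptions P₁ is at half the input resolution and P₅ at 1/32 of it.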

S5.3: constructing a multi-scale feature fusion network based on the SPPF module and the HRFPN network;

The multi-scale feature fusion network is composed of SPPF (Spatial Pyramid Pooling Fast) modules and an HRFPN (High Resolution Feature Pyramid Network) network.

The SPPF module consists of a convolution layer with a stride of 1 and a 1 × 1 kernel connected in series with three 5 × 5 max pooling layers. The output features of each layer are fused by a concatenation operation, and the number of channels is then adjusted by another 1 × 1 convolution layer. Each 1 × 1 convolution layer is followed by a batch normalization layer and a SiLU layer.
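The serial pooling in SPPF can be sketched on a single-channel map in plain Python. The sketch covers only the pooling chain (the 1 × 1 convolutions, batch normalization and SiLU are omitted) and assumes stride-1 pooling with padding that preserves the spatial size. Under that assumption, two and three chained 5 × 5 pools cover the same windows as single 9 × 9 and 13 × 13 pools, which is why the serial form yields multi-scale context cheaply.

```python
NEG_INF = float("-inf")


def max_pool(grid, k):
    """Stride-1 max pooling with a k x k window; positions outside the map are ignored."""
    h, w, r = len(grid), len(grid[0]), k // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            best = NEG_INF
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    best = max(best, grid[yy][xx])
            row.append(best)
        out.append(row)
    return out


def sppf_branches(grid, k=5):
    """Return the four maps that would be concatenated along the channel axis."""
    y1 = max_pool(grid, k)
    y2 = max_pool(y1, k)
    y3 = max_pool(y2, k)
    return [grid, y1, y2, y3]


# Deterministic toy feature map.
grid = [[(7 * y + 3 * x * x) % 11 for x in range(12)] for y in range(12)]
branches = sppf_branches(grid)
print(len(branches), len(branches[0]), len(branches[0][0]))
```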

The HRFPN network consists of four progressively shallower top-down branches. The network gradually increases the number of paths in width, and as the number of paths increases, the nodes in the depth direction of each path gradually decrease.

The HRFPN network uses lightweight feature aggregation nodes, as shown in Fig. 3. Each node has two or three input edges carrying feature maps of different resolutions: lower-level feature maps are downsampled by a depthwise convolution with a stride of 2, and higher-level feature maps are upsampled by nearest-neighbor interpolation, so that the different feature maps are brought to the same resolution. The feature maps are then integrated by weighted element-wise summation, and feature fusion is performed by a MobileNetV2 block with a stride of 1:

$$O = \sum_{i} \frac{w_i}{\epsilon + \sum_{j} w_j} \cdot I_i$$

where w_i is a learnable multi-dimensional tensor weight, ε = 0.0001 is used to avoid numerical instability, I_i denotes an input feature map, the normalized weights lie between 0 and 1, and O denotes the feature map obtained after applying the normalized weights.
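A minimal sketch of the weighted element-wise summation follows. It assumes that each weight is normalized by dividing it by the sum of all weights plus the small constant, and it simplifies the learnable tensor weights to one non-negative scalar per input. The function name `fuse` and the example values are hypothetical.

```python
def fuse(inputs, weights, eps=1e-4):
    """Weighted sum of same-sized 2D feature maps with normalized scalar weights."""
    total = eps + sum(weights)
    norm = [w / total for w in weights]  # each value lies between 0 and 1
    height, width = len(inputs[0]), len(inputs[0][0])
    return [
        [sum(n * fmap[y][x] for n, fmap in zip(norm, inputs)) for x in range(width)]
        for y in range(height)
    ]


a = [[2.0, 2.0], [2.0, 2.0]]
b = [[4.0, 4.0], [4.0, 4.0]]
fused = fuse([a, b], [1.0, 1.0])
print(fused[0][0])
```

With equal weights the result is close to the plain average of the inputs; during training the weights would shift toward the more informative resolution.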

The feature maps P₁, P₂, P₃, P₄, P₅ output by the backbone feature extraction network are processed by the corresponding SPPF modules and then sent to the HRFPN network, which outputs four feature maps Q₁, Q₂, Q₃, Q₄ of the same resolution.

S5.4: setting up a deep supervision network;

The feature maps Q₁, Q₂, Q₃, Q₄ are upsampled by a factor of two so that their resolution matches that of the input image, and a convolution layer with a stride of 1 and a 1 × 1 kernel adjusts the number of feature channels to 6 (5 defect classes + background). The joint loss function L_Union constructed in S5.1 is then used to compute the loss between each of the four output feature maps and the manual annotation.

S5.5: constructing a prediction network;

The four feature maps output by the deep supervision network are concatenated, and the number of channels of the resulting feature map is adjusted to 6 (5 defect classes + background) by a convolution layer with a stride of 1 and a 1 × 1 kernel. This feature map serves as the output of the final prediction network and is also used in the loss calculation.
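The concatenation and 1 × 1 convolution of the prediction network amount to a per-pixel linear map across channels. The sketch below shows this on nested lists with a reduced toy size (two branches of two channels instead of four branches of six). The weights are hand-picked rather than trained, and the final per-pixel argmax that turns channel scores into class labels is an assumption of this example rather than a step stated in the text.

```python
def conv1x1(fmap, weight, bias):
    """fmap: C_in x H x W nested lists; weight: C_out x C_in; bias: length C_out."""
    c_in, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [[b + sum(wrow[c] * fmap[c][y][x] for c in range(c_in)) for x in range(w)]
         for y in range(h)]
        for wrow, b in zip(weight, bias)
    ]


def argmax_channel(logits, y, x):
    """Index of the channel with the largest score at pixel (y, x)."""
    return max(range(len(logits)), key=lambda k: logits[k][y][x])


def predict_labels(branch_maps, weight, bias):
    """Concatenate branch outputs along channels, apply a 1x1 conv, take the argmax."""
    stacked = [channel for fmap in branch_maps for channel in fmap]
    logits = conv1x1(stacked, weight, bias)
    h, w = len(logits[0]), len(logits[0][0])
    return [[argmax_channel(logits, y, x) for x in range(w)] for y in range(h)]


# Two toy branches, each with 2 channels over a 1 x 2 image.
branch = [[[1.0, 0.0]], [[0.0, 1.0]]]
weight = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # 2 output x 4 input channels
bias = [0.0, 0.0]
labels = predict_labels([branch, branch], weight, bias)
print(labels)
```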

S6: Train the surface defect segmentation model and save the corresponding weight files during training. Each weight file is evaluated on the validation set in order to adjust the hyperparameters of the model, the weights with the highest segmentation accuracy are selected as the final trained model parameters, and the test set images are input into the final model to check its generalization performance.

In S6, network training combines transfer learning and network fine-tuning, and pre-trained weights are loaded before training starts. The whole training process runs for 200 epochs in two stages. In the first stage (the first 50 epochs) the backbone feature extraction network is frozen and the number of images per batch (batch size) is set to 8. In the second stage (the remaining 150 epochs) the backbone is unfrozen and the batch size is set to 4. A warmup strategy is used for the first 5 training epochs, followed by a cosine learning-rate schedule. The Adam optimizer is used, with the initial learning rate set to 5e-4 and the exponential decay factors β₁ set to 0.9 and β₂ set to 0.999.

S7: Surface defect segmentation prediction. The weights trained in S6 are loaded into the model, the image to be inspected is input into the model, and the defect types, defect positions and defect boundaries are output.
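The two-stage schedule can be written down as a small helper. The epoch counts, batch sizes and base learning rate come from the description above; the linear shape of the warmup, the single cosine cycle over the remaining epochs and the minimum learning rate of 0 are assumptions of this sketch, and the function names are hypothetical.

```python
import math

TOTAL_EPOCHS = 200
FREEZE_EPOCHS = 50
WARMUP_EPOCHS = 5
BASE_LR = 5e-4


def stage(epoch):
    """Return (backbone_frozen, batch_size) for a 0-based epoch index."""
    return (True, 8) if epoch < FREEZE_EPOCHS else (False, 4)


def learning_rate(epoch, min_lr=0.0):
    """Linear warmup followed by one cosine decay cycle (both shapes assumed)."""
    if epoch < WARMUP_EPOCHS:
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return min_lr + 0.5 * (BASE_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


for epoch in (0, 4, 5, 49, 50, 199):
    frozen, batch_size = stage(epoch)
    print(epoch, frozen, batch_size, learning_rate(epoch))
```

Freezing the backbone first lets the randomly initialized fusion and prediction layers settle before the pre-trained features are fine-tuned, and the smaller batch size in the second stage offsets the extra memory needed once the backbone receives gradients.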

Based on the method provided by the invention, the invention also provides an aluminum strip surface defect detection system based on a lightweight model, which comprises the following steps:

Further, the present invention also provides an electronic device, which may include: a processor, a communication interface, a memory, and a communication bus. The processor, the communication interface and the memory complete communication with each other through a communication bus. The processor may call a computer program in the memory to perform the lightweight model-based aluminum strip surface defect detection method.

Furthermore, when the computer program in the above memory is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk or an optical disk.

Further, the invention also provides a non-transitory computer readable storage medium, on which a computer program is stored, the computer program being executed to implement the method for detecting surface defects of aluminum strips based on a lightweight model.

The invention discloses a method, a system and equipment for detecting surface defects of aluminum strip based on a lightweight model, relating as a whole to the field of industrial computers. The method mainly comprises: acquiring images of the aluminum strip surface; cropping and screening the images; image annotation, in which defects are given pixel-level segmentation annotations and class labels to obtain a surface defect data set that is divided into a training set, a validation set and a test set; data enhancement; constructing the surface defect segmentation model, which consists of a backbone feature extraction network, a multi-scale feature fusion network, a deep supervision network and a prediction network; training the surface defect segmentation model, saving the corresponding weight files, evaluating each weight file on the validation set, selecting the one with the highest segmentation accuracy as the final model weights, and checking the generalization performance of the model on the test set; and surface defect segmentation prediction, in which the image to be inspected is input into the model loaded with the trained weights, and the defect types, defect positions and defect boundaries are output. The invention greatly reduces the number and complexity of model parameters while maintaining high segmentation accuracy, lays a theoretical foundation for deploying an aluminum strip defect segmentation algorithm on mobile and embedded equipment, and has broad application prospects.

In the present specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Since the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.

The principles and embodiments of the present invention have been described herein with reference to specific examples, and the description of these examples is intended only to assist in understanding the method of the present invention and its core ideas. Meanwhile, a person of ordinary skill in the art may, following the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the foregoing, this description should not be construed as limiting the invention.

Claims

1. The aluminum strip surface defect detection method based on the lightweight model is characterized by comprising the following steps of:

2. The method for detecting the surface defects of the aluminum strip based on the lightweight model according to claim 1, wherein the pixel-level labeling is performed on the screened defect subgraphs, the defect contours and the types are marked, and a surface defect data set is obtained, and the method specifically comprises the following steps:

converting the JSON format tag data set into a PNG format tag data set;

3. The method for detecting surface defects of aluminum strip based on a lightweight model according to claim 1, wherein the step of data enhancing the surface defect data set to obtain the data enhanced surface defect data set comprises the following steps:

4. The method for detecting surface defects of aluminum strips based on lightweight models according to claim 1, wherein the construction of the surface defect segmentation model specifically comprises:

constructing a joint loss function of the surface defect segmentation model;

constructing a prediction network based on the convolution layer;

5. The method for detecting surface defects of aluminum strip based on a lightweight model according to claim 1, wherein the training, verifying and testing of the surface defect segmentation model by using the data-enhanced surface defect dataset is performed to obtain a trained surface defect segmentation model, and specifically comprises the following steps:

6. An aluminum strip surface defect detection system based on a lightweight model, which is characterized by comprising:

7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the lightweight model-based aluminum strip surface defect detection method as claimed in any one of claims 1 to 5 when the computer program is executed.

8. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed implements the lightweight model-based aluminum strip surface defect detection method as claimed in any one of claims 1 to 5.