CN116777842A - Light texture surface defect detection method and system based on deep learning - Google Patents


Info

Publication number: CN116777842A
Application number: CN202310591633.3A
Authority: CN (China)
Prior art keywords: module, loss, texture surface, layer, training
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Inventors: Jin Yi (金一), Lu Haoran (鲁浩然), Wang Xu (王旭), Wang Tao (王涛), Li Yidong (李浥东)
Current and original assignee: Beijing Jiaotong University (the listed assignee may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Beijing Jiaotong University; priority to CN202310591633.3A
Other languages: Chinese (zh)

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a deep-learning-based lightweight method and system for texture surface defect detection; the method is divided into a training stage and a testing stage. In the training stage, texture surface images from the input training set are propagated forward through the network layer by layer to obtain prediction boxes of the defect features; the loss between each prediction box and the ground-truth box of the target image is then calculated, back-propagation is performed with this loss to update the model weights, and the process is repeated until the set number of iteration epochs is reached. In the testing stage, the test-set data are loaded, the trained model outputs the category and position of each defect image, and evaluation metrics are computed to judge the performance of the model. If the expected requirements are not met, the method returns to the training stage for further tuning; once the expected performance is reached, the model weights are saved, completing the workflow of the whole technical invention and yielding the final solution.

Description

Light texture surface defect detection method and system based on deep learning
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a deep-learning-based method and system for lightweight detection of texture surface defects.
Background
In recent years, artificial intelligence research has surged and hardware has iterated rapidly, driving strong progress in deep learning. Object detection is one of the most actively studied tasks in deep learning and has been widely applied; object detectors based on deep neural networks show prominent advantages in face recognition, image segmentation, pedestrian re-identification, industrial inspection, and other areas. Continuous advances in deep learning keep producing innovations in object detection. Deep-learning-based object detection algorithms fall into two categories according to their detection procedure: two-stage algorithms, which achieve high detection accuracy but low speed, and one-stage algorithms, which are less accurate than two-stage methods but much faster. Surface defect detection is a branch of object detection, and in industry it is one of the important links in product production. The traditional manual approach is time-consuming and labor-intensive, and the quality of its results depends heavily on the subjective judgment of experts. Deep learning overcomes these drawbacks, offering good detection performance and strong generality, but problems remain: data volumes in the industrial field are small, detection latency is high, and detection precision still needs improvement.
To improve detection accuracy, object detection network models have become increasingly complex and deep. They demand large computational resources, involve heavy computation, and have poor detection timeliness, so on edge or embedded devices with limited computing power they cannot meet the real-time requirements of defect detection. To meet those real-time requirements on edge or embedded devices, a lightweight model design is needed.
Making a network model lightweight means simplifying the network and reducing its parameters to increase computation speed while preserving accuracy. Many lightweighting approaches exist; they can be roughly divided into two categories: compressing an existing model, and designing a lightweight network from the start. Specific techniques include lightweight network structures, knowledge distillation, network pruning, and quantization.
Disclosure of Invention
The embodiment of the invention provides a light texture surface defect detection method and system based on deep learning, which are used for solving the problems existing in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme.
The lightweight texture surface defect detection method based on deep learning comprises the following steps:
S1, based on texture surface images of a training set, obtaining prediction boxes of defect features through layer-by-layer convolutional forward propagation in a network model equipped with a ShuffleNetV2 lightweight network;
S2, calculating the loss between the prediction boxes of defect features and the ground-truth boxes of the target images, and back-propagating the loss through the network model to update the model parameters;
S3, repeating steps S1 and S2 until a preset number of iterations is reached, obtaining texture surface defect images;
S4, testing and evaluating the texture surface defect images; if the evaluation result does not meet the preset requirement, modifying the hyperparameters of the network model and returning to steps S1 to S3; otherwise, outputting the texture surface defect images.
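The S1 to S4 steps form the standard train-and-evaluate cycle of deep learning. As a minimal sketch of that control flow only, the toy example below substitutes the patent's network and loss with a one-parameter linear model and MSE loss; all names and values here are illustrative, not taken from the patent:

```python
import numpy as np

def train_and_evaluate(x, y, epochs=200, lr=0.1, target_loss=1e-3):
    """Toy stand-in for the S1-S4 loop: forward pass, loss, back-propagation,
    parameter update, repeated for a fixed number of epochs, then evaluation."""
    rng = np.random.default_rng(0)
    w, b = rng.normal(), 0.0                    # model parameters
    for _ in range(epochs):                     # S3: repeat until epoch limit
        pred = w * x + b                        # S1: forward propagation
        err = pred - y
        grad_w = np.mean(2 * err * x)           # S2: gradient of the MSE loss
        grad_b = np.mean(2 * err)
        w -= lr * grad_w                        # S2: update model parameters
        b -= lr * grad_b
    final_loss = np.mean((w * x + b - y) ** 2)  # S4: evaluate the trained model
    return w, b, final_loss, bool(final_loss <= target_loss)

x = np.linspace(-1.0, 1.0, 32)
y = 3.0 * x + 0.5                               # ground truth: w = 3, b = 0.5
w, b, loss, meets_requirement = train_and_evaluate(x, y)
```

In the patent's method, the final check corresponds to S4's comparison against the preset requirement; failing it would trigger hyperparameter changes and a return to S1.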
Preferably, the network model comprises a backbone network, a neck, and an output part arranged sequentially along the data flow direction;
the backbone network has a 3x3 convolutional layer, a normalization layer, a ReLu active layer, and a max pooling layer sequentially arranged in a data flow direction, and a first shufflenet 2 layer composed of one Shuffle block of step size 2, a second shufflenet 2 layer composed of three Shuffle blocks of step size 1, a third shufflenet 2 layer composed of one Shuffle block of step size 2, a fourth shufflenet 2 layer composed of seven Shuffle blocks of step size 1, a fifth shufflenet 2 layer composed of one Shuffle block of step size 2, a sixth shufflenet 2 layer composed of three Shuffle blocks of step size 1;
each Shuffle block includes a Shortcut branch and a deep convolution branch; respectively carrying out feature extraction operation on the short circuit branch and the deep convolution branch when the step length is 1, then recombining the extracted feature information through channel re-washing, and merging the short circuit branch and the deep convolution branch when the step length is 2;
the neck portion includes:
the first part is formed by stacking a GhostConv module, a CARAFE upsampling operator, a concatenation module, and a C3Ghost module; the concatenation module of the first part concatenates with the backbone network;
the second part is formed by stacking a CA_H attention mechanism module, a GhostConv module, a CARAFE upsampling operator, a concatenation module, and a C3Ghost module; the concatenation module of the second part concatenates with the backbone network;
the third part is formed by stacking a CA_H attention mechanism module, a GhostConv module, a concatenation module, and a C3Ghost module; the concatenation module of the third part concatenates with the GhostConv module of the second part;
the fourth part is formed by stacking a CA_H attention mechanism module, a GhostConv module, a concatenation module, and a C3Ghost module; the concatenation module of the fourth part concatenates with the GhostConv module of the first part;
the CA_H attention mechanism module is provided with an H-sigmoid activation function;
the output part is provided with three GhostConv modules which are respectively connected with the second part, the third part and the fourth part of the neck part.
Preferably, each GhostConv module has a standard convolution submodule and a channel-by-channel (depthwise) convolution submodule arranged sequentially along the data flow direction, and the output of the GhostConv module is the channel-wise concatenation of the convolution results of the two submodules.
Preferably, calculating the loss between the prediction box with defect features and the ground-truth box of the target image is achieved by a SIoU regression loss function comprising an angle loss, a distance loss, a shape loss, and an IoU loss;
the angle loss is calculated by the formula
$\Lambda = 1 - 2\sin^2\!\big(\arcsin\tfrac{c_h}{\sigma} - \tfrac{\pi}{4}\big)$, with $\sigma = \sqrt{(b^{gt}_{cx} - b_{cx})^2 + (b^{gt}_{cy} - b_{cy})^2}$ and $c_h = \max(b^{gt}_{cy}, b_{cy}) - \min(b^{gt}_{cy}, b_{cy})$ (1)
where $b^{gt}_{cx}$ and $b^{gt}_{cy}$ represent the center coordinates of the ground-truth box, and $b_{cx}$ and $b_{cy}$ represent the center coordinates of the prediction box;
the distance loss is calculated by the formula
$\Delta = \sum_{t=x,y}\big(1 - e^{-\gamma\rho_t}\big)$, with $\rho_x = \big(\tfrac{b^{gt}_{cx} - b_{cx}}{c_w}\big)^2$, $\rho_y = \big(\tfrac{b^{gt}_{cy} - b_{cy}}{c_h}\big)^2$, and $\gamma = 2 - \Lambda$ (2)
where $c_h$ and $c_w$ are defined as the height and width of the smallest rectangle that encloses both anchor boxes;
the shape loss is calculated by the formula
$\Omega = \sum_{t=w,h}\big(1 - e^{-\omega_t}\big)^{\theta}$, with $\omega_w = \tfrac{|w - w^{gt}|}{\max(w,\,w^{gt})}$ and $\omega_h = \tfrac{|h - h^{gt}|}{\max(h,\,h^{gt})}$ (3)
where $w$ and $h$ are defined as the width and height of the bounding box output by the model, $w^{gt}$ and $h^{gt}$ are defined as the width and height of the ground-truth box of the object, and $\theta$ is a variable factor representing the weight of the shape loss;
the IoU loss is calculated by the formula
$IoU = \tfrac{|A \cap B|}{|A \cup B|}$ (4)
where $A$ and $B$ respectively represent the two rectangular boxes;
based on formulas (1) to (4), the SIoU regression loss function is obtained as
$L_{SIoU} = 1 - IoU + \tfrac{\Delta + \Omega}{2}$ (5)
In a second aspect, the invention provides a lightweight texture surface defect detection system based on deep learning, comprising a training module and a testing module;
the training module has a training set, and is further configured to:
based on texture surface images of the training set, obtain prediction boxes of defect features through layer-by-layer convolutional forward propagation in a network model equipped with a ShuffleNetV2 lightweight network;
calculate the loss between the prediction boxes of defect features and the ground-truth boxes of the target images, and back-propagate the loss through the network model to update the model parameters;
repeat the above process until a preset number of iterations is reached, obtaining texture surface defect images;
the test module adds the texture surface defect image output by the training module into a test set of the test module, and the test module is also used for: and (3) testing and evaluating the texture surface defect image, if the evaluation result does not reach the preset requirement, modifying the super parameters of the network model, returning to the execution process of the execution training module, and otherwise, outputting the texture surface defect image.
According to the technical scheme provided by the embodiments of the invention, the invention provides a deep-learning-based lightweight method and system for texture surface defect detection; the method is divided into a training stage and a testing stage. In the training stage, texture surface images from the input training set are propagated forward layer by layer to obtain prediction boxes of the defect features; the loss between each prediction box and the ground-truth box of the target image is then calculated, back-propagation is performed with this loss to update the model weights, and the process is repeated until the set number of iteration epochs is reached. In the testing stage, the test-set data are loaded, the trained model outputs the category and position of each defect image, and evaluation metrics are computed to judge the performance of the model. If the expected requirements are not met, the method returns to the training stage for further tuning; once the expected performance is reached, the model weights are saved, completing the workflow and yielding the final solution. The method and system provided by the invention have the following advantages:
1. Compared with the existing lightweight obstacle detection model that improves the YOLOv5s detection network, the invention achieves a good detection effect, with a precision of 97.9%.
2. The invention combines two lightweight networks, an attention mechanism, a lightweight upsampling operator, and the SIoU loss function to keep the model parameter count small; the whole model is only 0.62 MB.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a prior art detection model architecture;
FIG. 2 is a process flow diagram of a lightweight texture surface defect detection method based on deep learning provided by the invention;
FIG. 3 is a schematic diagram of a network model of the deep learning-based lightweight texture surface defect detection method according to the present invention;
FIG. 4 is a schematic diagram of a ShuffleNetV2 convolution block of the deep-learning-based lightweight texture surface defect detection method of the present invention;
FIG. 5 is a schematic diagram of a CA attention mechanism module of the deep learning-based lightweight texture surface defect detection method according to the present invention;
FIG. 6 is a schematic diagram of a CA_H attention mechanism module of the deep learning based lightweight texture surface defect detection method according to the present invention;
fig. 7 is a schematic diagram of a GhostConv module of the deep learning-based lightweight texture surface defect detection method provided by the invention;
FIG. 8 is a flow chart of a preferred embodiment of a deep learning based lightweight texture surface defect detection method provided by the present invention;
FIG. 9 is a logical block diagram of a deep learning based lightweight texture surface defect detection system provided by the present invention;
fig. 10 is a schematic diagram of the two rectangular boxes used for calculating the IoU loss in the deep-learning-based lightweight texture surface defect detection method according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the present invention and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the purpose of facilitating an understanding of the embodiments of the invention, reference will now be made to several specific embodiments illustrated in the accompanying drawings, which should in no way be taken to limit the embodiments of the invention.
The invention provides a lightweight texture surface defect detection method and system based on channel shuffle and ghost features, which are used to solve the following problems in the prior art:
currently, most of product part defect detection in the industry is completed manually, and this method has high labor cost, low detection rate and low efficiency, for example: PCB surface defect detection, steel plate defect detection, metal shaft detection and the like. With the development of the deep learning technology, the application of the deep learning technology in the industry is a hot field, machines are effectively used for replacing manpower, and the Chinese manufacturing is changed into the Chinese 'intelligent' manufacturing. The deep learning technology overcomes the defects of the traditional method, has good detection effect and strong universality, but still has some problems, has small data volume in the industrial field, has slow detection timeliness, needs to be improved in detection precision, and is difficult to meet the requirements of real-time detection in industrial production. The method aims to improve the detection accuracy of the target detection network model, the target detection network model is more and more complex, the depth is also deeper and deeper, the required computational power resource is large, the calculation amount of the model is large, the detection timeliness is low, and the real-time requirement of defect detection cannot be met in edge equipment or embedded equipment with limited computational power. In order to meet the real-time requirement of defect detection in edge equipment or embedded equipment, a one-stage detection algorithm YOLOv5 is used as a reference model for light-weight design, and the purpose of reducing the complexity and parameter quantity of a network structure while maintaining high detection accuracy is achieved.
YOLOv5, the mainstream single-stage detection network, performs well in both detection speed and detection precision and has attracted wide attention in practical engineering applications. Aiming at the poor real-time performance and low detection precision of traditional train-track obstacle detection methods, a lightweight obstacle detection model, YOLOv5s-MGCT, which improves the YOLOv5s detection network, was proposed; it greatly increases detection speed, improves detection precision, and achieves good results in practical applications.
As shown in fig. 1, this related technical scheme introduces the lighter Mixup data augmentation in place of the original Mosaic augmentation in the algorithm; it introduces the depthwise separable convolution GhostConv from the GhostNet structure to replace the ordinary convolution layers in the feature extraction and feature fusion networks of the original YOLOv5s model, reducing the computational cost of the model; it adds a CA spatial attention mechanism at the end of the feature extraction network, reducing the loss of important position information during training and compensating for the precision lost through the GhostNet modification; and it applies sparse training and channel pruning to the improved model, pruning channels that contribute little to detection precision while retaining important feature information, making the model lighter. Compared with current mainstream detection algorithms, it has certain advantages in detection precision and speed, and is suitable for detecting obstacle targets in complex rail transit environments.
In addition, although the existing YOLOv5s-MGCT network structure introduces the GhostNet lightweight network, its overall model size of 4.7 MB needs to be reduced further. Its detection precision (mAP@0.5) is 94.7%; since industrial precision requirements are very high, this value also needs further improvement.
Referring to fig. 2, the invention provides a lightweight texture surface defect detection method based on deep learning, which comprises the following steps:
S1, based on texture surface images of a training set, obtaining prediction boxes of defect features through layer-by-layer convolutional forward propagation in a network model equipped with a ShuffleNetV2 lightweight network;
S2, calculating the loss between the prediction boxes of defect features and the ground-truth boxes of the target images, and back-propagating the loss through the network model to update the model parameters;
S3, repeating steps S1 and S2 until a preset number of iterations is reached, obtaining texture surface defect images;
S4, testing and evaluating the texture surface defect images; if the evaluation result does not meet the preset requirement, modifying the hyperparameters of the network model and returning to steps S1 to S3; otherwise, outputting the texture surface defect images.
The invention aims to meet the real-time requirement of defect detection on edge or embedded devices; it uses the one-stage detection algorithm YOLOv5 as the reference model for lightweight innovation, aiming to reduce network complexity and parameter count while maintaining high detection accuracy.
In the preferred embodiment of the present invention, the basic flow of the texture surface defect detection method is shown in fig. 2. First, in the model training stage, texture surface images of the training set are input and propagated forward through layer-by-layer convolution to obtain prediction boxes of defect features; the loss between each prediction box and the ground-truth box of the target image is then calculated, back-propagation is performed with this loss to update the model weights, and the process is repeated until the set number of iteration epochs is reached. In the testing stage, the test-set data are loaded, the trained model outputs the category and position of each defect image, and evaluation metrics are computed to judge the performance of the model. If the expected requirements are not met, the method returns to the training stage for further tuning; once the expected performance is reached, the model weights are saved, completing the workflow and yielding the final solution.
The invention provides a new model algorithm, a lightweight texture surface defect detection method based on channel shuffle and ghost features. As shown in fig. 3, an improved network model is provided, comprising a backbone network, a neck, and an output part arranged sequentially along the data flow direction.
The backbone network has a ShuffleNetV2 lightweight network for feature extraction, which specifically includes, arranged sequentially along the data flow direction, a 3x3 convolutional layer, a normalization layer, a ReLU activation layer, and a max pooling layer, followed by a first ShuffleNetV2 layer composed of one Shuffle block of stride 2, a second ShuffleNetV2 layer composed of three Shuffle blocks of stride 1, a third ShuffleNetV2 layer composed of one Shuffle block of stride 2, a fourth ShuffleNetV2 layer composed of seven Shuffle blocks of stride 1, a fifth ShuffleNetV2 layer composed of one Shuffle block of stride 2, and a sixth ShuffleNetV2 layer composed of three Shuffle blocks of stride 1.
Each Shuffle block includes a shortcut branch and a depthwise convolution branch. When the stride is 1, feature extraction is performed on the shortcut branch and the depthwise convolution branch separately and the extracted feature information is then recombined by channel shuffle; when the stride is 2, the shortcut branch and the depthwise convolution branch are merged.
The neck portion includes:
the first part is formed by stacking a GhostConv module, a CARAFE upsampling operator, a concatenation module, and a C3Ghost module; the concatenation module of the first part concatenates with the fourth ShuffleNetV2 layer of the backbone network;
the second part is formed by stacking a CA_H attention mechanism module, a GhostConv module, a CARAFE upsampling operator, a concatenation module, and a C3Ghost module; the concatenation module of the second part concatenates with the second ShuffleNetV2 layer of the backbone network;
the third part is formed by stacking a CA_H attention mechanism module, a GhostConv module, a concatenation module, and a C3Ghost module; the concatenation module of the third part concatenates with the GhostConv module of the second part;
the fourth part is formed by stacking a CA_H attention mechanism module, a GhostConv module, a concatenation module, and a C3Ghost module; the concatenation module of the fourth part concatenates with the GhostConv module of the first part;
the CA_H attention mechanism module is provided with an H-sigmoid activation function;
the output part is provided with three GhostConv modules which are respectively connected with the second part, the third part and the fourth part of the neck part.
Compared with the basic flow and the prior art, the method has five main innovations. First, to obtain features quickly in the backbone, the backbone network is built from the ShuffleNetV2 lightweight network, giving faster detection and fewer parameters. Second, the Sigmoid activation function can help a network model improve performance, but it has an exponential form and is expensive to compute; the H-sigmoid function has a curve similar to the Sigmoid function but contains no exponential operation, so compared with the Sigmoid function it reduces computation and saves network inference time. In the CA attention module, the original Sigmoid function is therefore replaced by the H-sigmoid function, and the improved attention module CA_H is constructed on the basis of the CA attention mechanism. Third, the lightweight GhostNet module is used to deploy the network model in the neck in a lightweight fashion; because the CA_H attention module contains spatial direction information and can address long-range dependence, it can find the regions of interest in the activation map, enrich its semantic information, and exclude useless information, so that, combined with the GhostConv module, it obtains enough feature information to help the model generate target boxes. Fourth, the original upsampling operation is replaced by the lightweight upsampling operator CARAFE, whose receptive field boundary is larger; taking the feature map obtained by Ghost convolution as the upsampling input thus prevents the loss of useful information. Fifth, using the SIoU loss function improves the detection effect of the model.
In the preferred embodiment provided by the present invention, the modules designed and used in the present invention are specifically arranged and function as follows.
(1) ShuffleNetV2 lightweight network module
A ShuffleNetV2 lightweight network is used in the backbone to build a lightweight, fast feature extraction network. The basic component of ShuffleNetV2 is the Shuffle block, which comes in two variants according to stride, as shown in fig. 4, which specifically illustrates the Shuffle block in the ShuffleNetV2 network. When the stride is 1, following the third of the four design principles proposed by ShuffleNetV2, a channel split operation divides the input channels evenly into two parts; one branch acts as a shortcut and performs no operations. The other branch, following the second of the four principles, uses two convolutions with kernel size 1 and one depthwise convolution with kernel size 3; the two branches are then concatenated and fused together, satisfying the first and fourth principles, and finally the feature information of the two channel groups is rearranged and mixed by channel shuffle. When the stride is 2, the input does not use the channel split operation, so the output channel dimension is doubled; one 3x3 depthwise separable convolution and one standard convolution are added to the branch that performs no operations at stride 1, and the other operations are unchanged.
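The channel shuffle operation that recombines the two branches can be sketched with NumPy using the reshape-transpose-reshape trick; the (N, C, H, W) array layout is an assumption matching common convolutional-network conventions:

```python
import numpy as np

def channel_shuffle(x, groups):
    """Channel shuffle as used in ShuffleNetV2: split the channel dimension
    into `groups` groups and interleave them, so information flows between
    the branches after concatenation.  x has shape (N, C, H, W)."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    x = x.reshape(n, groups, c // groups, h, w)  # (N, g, C//g, H, W)
    x = x.transpose(0, 2, 1, 3, 4)               # swap group and channel axes
    return x.reshape(n, c, h, w)

# Example: 4 channels in 2 groups are interleaved to the order [0, 2, 1, 3]
x = np.arange(4, dtype=float).reshape(1, 4, 1, 1)
shuffled = channel_shuffle(x, groups=2)
```

The output shape is unchanged; only the channel ordering is permuted, which is why the operation adds no parameters to the network.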
(2) CA_H attention mechanism module
The SE module attends only to channels and loses feature position information; the CA attention module improves on this shortcoming by splitting the two-dimensional pooling into one-dimensional feature encodings along the vertical and horizontal directions and then encoding the feature channels that contain spatial-direction information, so that each channel effectively integrates the position information of the input features along a specific direction, which addresses the long-range dependence problem. Finally, the two attention maps are multiplied with the original input features to enrich the position information of the feature map. Because the module is small and has few parameters, it can be inserted into the network model. The main structure of the CA module is shown in fig. 5.
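The directional pooling, encoding, and re-weighting steps above can be sketched as below. This is a minimal illustration under stated assumptions (the reduction ratio is assumed, and `F.hardsigmoid` is used for the gates, which is exactly the CA-to-CA_H change the next paragraph describes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAH(nn.Module):
    """Sketch of a CA_H-style module: coordinate attention whose final
    Sigmoid gates are replaced by H-Sigmoid."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # split 2-D pooling into two 1-D directional poolings
        x_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        y = F.relu(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        # H-Sigmoid instead of Sigmoid: this is the CA -> CA_H substitution
        a_h = F.hardsigmoid(self.conv_h(y_h))                       # (n, c, h, 1)
        a_w = F.hardsigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # (n, c, 1, w)
        return x * a_h * a_w   # re-weight input with both directional maps

out = CAH(16)(torch.randn(2, 16, 8, 8))
print(out.shape)  # torch.Size([2, 16, 8, 8])
```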
The Sigmoid activation function can help the network model improve performance, but it is exponential in form and computationally expensive. The H-Sigmoid activation function approximates Sigmoid as closely as possible yet involves no exponential operation, which reduces computation and saves network inference time. Therefore, in the CA attention module, the original Sigmoid function is replaced by the H-Sigmoid function, constructing a new module CA_H improved from the CA attention mechanism, as shown in fig. 6.
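The cost argument can be checked numerically; the pure-Python comparison below uses the usual relu6(x+3)/6 form of H-Sigmoid, which stays within about 0.09 of the true Sigmoid while needing no exponential:

```python
import math

def sigmoid(x: float) -> float:
    # standard logistic function: requires evaluating an exponential
    return 1.0 / (1.0 + math.exp(-x))

def h_sigmoid(x: float) -> float:
    # relu6(x + 3) / 6: piecewise linear, no exponential to evaluate
    return min(max(x + 3.0, 0.0), 6.0) / 6.0

for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, round(sigmoid(x), 3), round(h_sigmoid(x), 3))
```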
(3) GhostNet lightweight network module
GhostNet proposes the GhostConv module, which processes features and enriches feature maps with redundant information obtained through lower-cost operations. The main structure of the GhostConv module is shown in fig. 7: the module first applies a standard convolution that halves the number of output channels, then applies a channel-by-channel (depthwise) convolution to the convolved result, and finally channel-concatenates the first-step and second-step convolution results. Compared with ordinary convolution, GhostConv effectively reduces the parameter count of convolution operations.
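A minimal GhostConv sketch along these lines, assuming a PyTorch setting; the 5x5 depthwise "cheap" kernel and SiLU activation follow common YOLOv5-style implementations and are assumptions, not details stated here:

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """GhostConv sketch: a standard convolution produces half the output
    channels, a cheap depthwise convolution generates the 'ghost' half,
    and the two halves are channel-concatenated."""
    def __init__(self, c_in: int, c_out: int, k: int = 1, s: int = 1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),  # depthwise
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)  # intrinsic + ghost maps

out = GhostConv(16, 32)(torch.randn(2, 16, 20, 20))
print(out.shape)  # torch.Size([2, 32, 20, 20])
```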
Let the input data be X ∈ R^(c×h×w), the convolution kernel be K, and the output be Y ∈ R^(h'×w'×n); the computation of the ordinary convolution operation is c×k×k×h'×w'×n, where n is the number of convolution kernels and c is the number of channels. Some similar feature maps, called "ghosts", do not need to be extracted again in the GhostConv module: they can be generated through operations with small computation and low cost. Let the number of intrinsic channels be m; each feature map is converted by s low-cost operations, so n = m×s feature maps can be obtained. In GhostConv the last operation is an identity mapping, so m×(s-1) conversion operations are counted; considering the operating efficiency of real scenarios, the kernel size of the conversion operations is uniformly set to d. The ratio of the computation of the ordinary convolution operation to that of the Ghost convolution operation is shown in the following formula, where d and k do not differ greatly and s ≪ c.
It follows that the GhostConv operation saves a large amount of computation compared with the ordinary convolution operation, and offers clear advantages in accuracy and model parameter size.
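The computation-ratio claim can be checked with simple arithmetic; the helpers below plug illustrative sizes into the two cost expressions (symbols as defined above), and the ratio indeed approaches s when s ≪ c:

```python
def flops_standard(c, k, hp, wp, n):
    # ordinary convolution: c * k * k * h' * w' * n multiply-accumulates
    return c * k * k * hp * wp * n

def flops_ghost(c, k, d, hp, wp, n, s):
    # primary convolution produces m = n/s intrinsic maps; cheap d x d
    # depthwise operations generate the remaining (s - 1) * m "ghost" maps
    m = n // s
    return c * k * k * hp * wp * m + (s - 1) * d * d * hp * wp * m

c, k, d, hp, wp, n, s = 64, 3, 3, 32, 32, 128, 2  # illustrative sizes
ratio = flops_standard(c, k, hp, wp, n) / flops_ghost(c, k, d, hp, wp, n, s)
print(round(ratio, 3))  # 1.969, i.e. close to s*c/(c + s - 1) ~= s for s << c
```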
(4) CARAFE up-sampling operator module
A well-performing upsampling operator tends to have a large receptive field, so that neighborhood information can be used efficiently. Second, the size of the upsampling kernel should be dynamically matched to the semantic information of the activation map, implementing upsampling according to the input content. Third, the upsampling operation should avoid a large number of parameters and keep computational complexity low, achieving a lightweight effect. The CARAFE upsampling operator has these characteristics: it has a large receptive field when rearranging features, can perform the upsampling operation dynamically according to the input data, and has a small computation cost. CARAFE is divided into a kernel prediction part and a feature reassembly part. When input data reach the module, they first enter the kernel prediction part, which dynamically matches the upsampling kernel to the input content; upsampling is then realized through the content-aware reassembly module.
(5) SIoU Loss
None of the previously proposed regression loss functions considers the angle between the prediction frame and the real frame. To address this, SIoU Loss was proposed: it comprehensively considers distance, shape, IoU, and angular direction when calculating the penalty between the real target frame and the model's predicted anchor frame in an image. The added angular-direction penalty term greatly facilitates the training process, because it encourages the prediction box to move quickly to the nearest axis, after which only one coordinate, X or Y, needs to be regressed. In other words, the added angular-direction penalty term reduces the total number of degrees of freedom, speeds up calculating the distance between the real frame and the model's predicted bounding box, and accelerates their convergence.
The SIoU regression loss function consists of angle loss, distance loss, shape loss, and IoU loss. The angular loss definition is given in equations 1 and 2.
Wherein b_cx^gt and b_cy^gt represent the center coordinate values of the real frame, and b_cx and b_cy represent the center coordinate values of the prediction frame.
The distance loss is defined in formula 3, wherein c_h and c_w are defined as the height and width of the smallest rectangle that encloses both anchor boxes.
The definition of the shape loss Ω is given in equation 4, where w and h are defined as the width and height of the model output bounding box, w^gt and h^gt are defined as the width and height of the actual box of the object, and θ is a variable factor serving as the weight of the shape loss.
IoU loss definition is given in equation 5, wherein A and B represent two rectangular frames, specifically the real frame A and the predicted frame B; the value of IoU equals the ratio of the intersection of the two rectangular frames to their union (as shown in fig. 10);
the final SIoU regression loss function definition is given in equation 6.
The invention also provides an embodiment illustrating the execution process of the method provided by the invention.
The invention relates to a lightweight texture surface defect detection method based on channel shuffle and ghost features; the basic flow is shown in fig. 8. The implementation mainly comprises the following stages: a preprocessing stage, a feature extraction stage, a loss calculation stage, a model iteration optimization stage, and a model test and evaluation stage. The specific operation of each stage is explained in detail below.
Before the method is used, a technician is required to configure the related environment, including installing a Linux operating system, a Python 3.8 (or later) development environment, and the PyTorch 1.11 (or later) deep learning framework. Because the algorithm used by the invention is a deep-learning model algorithm, performing the model training process in a GPU environment is recommended, which requires installing the GPU version of PyTorch 1.11 (or later) and a corresponding version of the CUDA parallel computing architecture.
Input of algorithm:
1. Texture image data: comprising a training set and a test set; the training images are used to train the model's ability to extract features, and the test set is used to verify the model's performance.
2. Model algorithm hyper-parameters: including image size, batch size in training, iteration number and learning rate, optimizer momentum factor, etc.
Output of the algorithm:
The trained parameter weights of the model algorithm that reach the performance evaluation standard.
The method comprises the following steps:
1. Preprocessing stage
Step 1-1: converting the weakly marked data into data in the form of anchor frames;
step 1-2: loading texture pictures (comprising training set and test set data) into a GPU video memory;
step 1-3: use Mosaic data enhancement: randomly select four pictures, apply operations such as scaling and cropping, and finally synthesize them into one image.
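A simplified sketch of the Mosaic step above, assuming NumPy; bounding-box handling and the random scaling ranges are omitted for brevity, and the grey fill value and quadrant layout are assumptions in the style of common implementations:

```python
import numpy as np

def mosaic4(imgs, out_size=640, seed=0):
    """Combine four images into one canvas around a random centre point.
    Sketch of Mosaic augmentation (label/box remapping omitted)."""
    rng = np.random.default_rng(seed)
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)  # grey fill
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))  # random centre x
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))  # random centre y
    quads = [(0, 0, cy, cx), (0, cx, cy, out_size),
             (cy, 0, out_size, cx), (cy, cx, out_size, out_size)]
    for img, (y1, x1, y2, x2) in zip(imgs, quads):
        h, w = y2 - y1, x2 - x1
        ys = np.arange(h) * img.shape[0] // h   # nearest-neighbour resize
        xs = np.arange(w) * img.shape[1] // w
        canvas[y1:y2, x1:x2] = img[ys][:, xs]
    return canvas

imgs = [np.full((100, 120, 3), v, dtype=np.uint8) for v in (0, 60, 120, 180)]
m = mosaic4(imgs)
print(m.shape, m[0, 0, 0], m[-1, -1, 0])  # (640, 640, 3) 0 180
```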
2. Feature extraction stage
Step 2-1: firstly, obtaining feature information in the backbone network through a 3x3 convolution layer, a normalization layer, a ReLU activation layer and maximum pooling, and then passing through a Shuffle block with a step length of 2, three Shuffle blocks with a step length of 1, a Shuffle block with a step length of 2, seven Shuffle blocks with a step length of 1, a Shuffle block with a step length of 2 and three Shuffle blocks with a step length of 1;
step 2-2: features output by the backbone network are fed into the neck part; specifically, the neck can be divided into four parts: the first part consists of a GhostConv, a CARAFE up-sampling operator, a splice with the fourth layer of the backbone network and a C3Ghost superposition; the second part consists of a CA_H attention mechanism, a GhostConv, a CARAFE up-sampling operator, a splice with the second layer of the backbone network and a C3Ghost superposition; the third part consists of a CA_H attention mechanism, a GhostConv, a splice with the twelfth layer of the network structure and a C3Ghost superposition; the fourth part consists of a CA_H attention mechanism, a GhostConv, a splice with the seventh layer of the network structure and a C3Ghost superposition.
Step 2-3: and predicting the fifteenth, nineteenth and twenty third output results in the network structure to obtain a target classification prediction result.
3. Loss calculation stage
Step 3-1: for positioning Loss, a SIoU Loss calculation is used; for confidence and classification loss, a binary cross entropy function is used to calculate;
step 3-2: the three losses calculated above are added to obtain the final loss.
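The loss combination in steps 3-1 and 3-2 can be sketched as below; the gain weights are assumed YOLOv5-style values (not specified here), and the SIoU box term is taken as already computed:

```python
import torch
import torch.nn.functional as F

def total_loss(pred_obj, pred_cls, tgt_obj, tgt_cls, siou_box_loss,
               w_box=0.05, w_obj=1.0, w_cls=0.5):
    # confidence and classification losses: binary cross-entropy on logits
    obj_loss = F.binary_cross_entropy_with_logits(pred_obj, tgt_obj)
    cls_loss = F.binary_cross_entropy_with_logits(pred_cls, tgt_cls)
    # sum the three terms (weights here are assumptions, not patent values)
    return w_box * siou_box_loss + w_obj * obj_loss + w_cls * cls_loss

loss = total_loss(torch.zeros(4), torch.zeros(4, 2),
                  torch.ones(4), torch.ones(4, 2),
                  siou_box_loss=torch.tensor(0.3))
print(round(float(loss), 4))  # 0.05*0.3 + 1.5*ln(2), about 1.0547
```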
4. Model optimization stage
Step 4-1: the code implementation is based on the PyTorch deep learning framework; back-propagation can be performed starting from the finally calculated composite loss value, and the gradient values of the parameters in the model are calculated automatically;
step 4-2: using the gradients calculated in the previous step, update the learnable parameter values of the model algorithm with an optimizer (e.g., the SGD optimizer of PyTorch);
step 4-3: repeat all the preceding execution steps until the model reaches the number of rounds set by the hyper-parameters, then stop the training process.
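Steps 4-1 to 4-3 amount to a standard PyTorch training loop; the sketch below uses a tiny stand-in model and dummy targets rather than the full detection network, with the momentum factor set to a commonly assumed value:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Tiny stand-in; in the real method this is the full detection network.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1),
                      nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(),
                      nn.Linear(8, 4))
# SGD with a momentum factor, per step 4-2
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.937)

epochs = 3  # the round count set by the hyper-parameters (step 4-3)
for epoch in range(epochs):
    x = torch.randn(2, 3, 32, 32)   # batch of images
    y = torch.randn(2, 4)           # dummy regression targets
    loss = F.mse_loss(model(x), y)  # stand-in for the composite loss
    opt.zero_grad()
    loss.backward()   # autograd computes every parameter gradient (step 4-1)
    opt.step()        # optimizer updates the learnable parameters (step 4-2)
print(float(loss))
```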
5. Test evaluation stage
Step 5-1: the texture image of the test set is read and loaded to the GPU video memory, and standardized operation which is the same as that of the training link is carried out (note that image enhancement is not needed during test);
step 5-2: and (3) adopting Mean Average Precsion (mAP), parameter quantity, calculated quantity and Frames Per Second (FPS) evaluation indexes commonly used in the defect detection task, and primarily evaluating the model quality by evaluating the calculated index values.
Step 5-3: if the evaluation result does not meet the requirement, the hyper-parameters of the model need to be adjusted, and the first execution step is returned to for another round of model training; if the evaluation result meets the requirement, the model weights can be saved, yielding the solution of the lightweight defect detection task.
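The parameter-count and FPS indexes from step 5-2 can be measured with small helpers like these (mAP computation is omitted, as it requires full detection outputs; the placeholder model is purely illustrative):

```python
import time
import torch
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    # total number of learnable parameters
    return sum(p.numel() for p in model.parameters())

def measure_fps(model: nn.Module, shape=(1, 3, 640, 640), n_iters=10) -> float:
    model.eval()
    x = torch.randn(*shape)
    with torch.no_grad():
        for _ in range(2):                 # warm-up runs
            model(x)
        t0 = time.perf_counter()
        for _ in range(n_iters):
            model(x)
    return n_iters / (time.perf_counter() - t0)

tiny = nn.Conv2d(3, 8, 3)                  # placeholder for the trained model
n_params = count_params(tiny)              # 3*8*3*3 + 8 = 224
fps = measure_fps(tiny, shape=(1, 3, 64, 64))
print(n_params, fps > 0)
```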
In a second aspect, the present invention provides a system for performing the above method, comprising a training module 801 and a testing module 802;
the training module 801 has a training set, and is further configured to:
based on the texture surface image of the training set, a prediction frame with defect characteristics is obtained through forward propagation processing of layer-by-layer convolution in the network model;
obtaining loss between a prediction frame with defect characteristics and a target image real frame through calculation, and reversely transmitting the loss to a network model to update model parameters;
repeatedly executing the process until the preset iteration times are reached, and obtaining a texture surface defect image;
The test module 802 adds the texture surface defect image output by the training module to its own test set, and is further configured to: test and evaluate the texture surface defect image; if the evaluation result does not reach the preset requirement, modify the hyper-parameters of the network model and return to the execution process of the training module; otherwise, output the texture surface defect image.
In summary, the present invention provides a method and a system for detecting defects on a lightweight texture surface based on deep learning, wherein the method is divided into training and testing stages. The training stage is based on texture surface images of an input training set, the texture surface images are forward propagated layer by layer to obtain a prediction frame of the defect characteristics, the prediction frame of the defect characteristics is obtained, then loss between the prediction frame of the defect characteristics and a real frame of a target image is calculated, reverse propagation is carried out by using the loss, model weights are updated, and the process is repeated until the set iteration round number epoch is reached. And in the test stage, loading data of a test set, outputting the category and the position of the defect image through a trained model, performing evaluation index calculation, judging the performance of the model according to the index, returning to a training link again if the expected requirement cannot be met, performing further adjustment training, and storing model weights if the expected performance is reached, so that the flow of the whole technical invention is completed, and a final solution is obtained. The method and the system provided by the invention have the following advantages:
1. Compared with existing lightweight detection methods based on the improved YOLOv5s detection network, the method of the invention has a good detection effect, with a precision of 97.9%.
2. The invention utilizes two lightweight networks, an attention mechanism, a lightweight upsampling operator and a SIoU loss function to ensure that the model parameter quantity is small, and the whole model is only 0.62MB.
Those of ordinary skill in the art will appreciate that: the drawing is a schematic diagram of one embodiment and the modules or flows in the drawing are not necessarily required to practice the invention.
From the above description of embodiments, it will be apparent to those skilled in the art that the present invention may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present invention.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for apparatus or system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, with reference to the description of method embodiments in part. The apparatus and system embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims (5)

1. The lightweight texture surface defect detection method based on deep learning is characterized by comprising the following steps of:
s1, obtaining a prediction frame with defect characteristics through layer-by-layer convolution forward propagation processing of a network model provided with a ShuffleNetv2 lightweight network based on a texture surface image of a training set;
s2, obtaining loss between a prediction frame with defect characteristics and a target image real frame through calculation, and reversely transmitting the loss to the network model to update model parameters;
s3, repeatedly executing the steps S1 and S2 until the preset iteration times are reached, and obtaining a texture surface defect image;
s4, testing and evaluating the texture surface defect image, if the evaluation result does not meet the preset requirement, modifying the super parameters of the network model, returning to the steps S1 to S3, and otherwise, outputting the texture surface defect image.
2. The method of claim 1, wherein the network model comprises a backbone network, a neck, and an output, sequentially arranged along a data flow direction;
the backbone network has a 3x3 convolutional layer, a normalization layer, a ReLU activation layer and a maximum pooling layer which are sequentially arranged along the data flow direction, a first ShuffleNetv2 layer consisting of one Shuffle block with a step length of 2, a second ShuffleNetv2 layer consisting of three Shuffle blocks with a step length of 1, a third ShuffleNetv2 layer consisting of one Shuffle block with a step length of 2, a fourth ShuffleNetv2 layer consisting of seven Shuffle blocks with a step length of 1, a fifth ShuffleNetv2 layer consisting of one Shuffle block with a step length of 2, and a sixth ShuffleNetv2 layer consisting of three Shuffle blocks with a step length of 1;
each Shuffle block includes a Shortcut branch and a deep convolution branch; when the step length is 1, the Shortcut branch and the depth convolution branch respectively perform characteristic extraction operation, then the extracted characteristic information is recombined through channel re-washing, and when the step length is 2, the Shortcut branch and the depth convolution branch are combined;
the neck portion includes:
the first part is formed by overlapping a GhostConv module, a CARAFE up-sampling operator, a splicing module and a C3Ghost module; the splicing module of the first part is used for splicing the backbone network;
the second part is formed by overlapping a CA_H attention mechanism module, a GhostConv module, a CARAFE up-sampling operator, a splicing module and a C3Ghost module; the splicing module of the second part is used for splicing the backbone network;
the third part consists of a CA_H attention mechanism module, a GhostConv module, a splicing module and a C3Ghost superposition; the splicing module of the third part is used for splicing the GhostConv module of the second part;
the fourth part is formed by overlapping a CA_H attention mechanism module, a GhostConv module, a splicing module and a C3Ghost module; the splicing module of the fourth part is used for splicing the GhostConv module of the first part;
the CA_H attention mechanism module is provided with an H-sigmoid activation function;
the output part is provided with three GhostConv modules which are respectively connected with the second part, the third part and the fourth part of the neck part.
3. The method according to claim 1, wherein each of the GhostConv modules has a standard convolution sub-module and a channel-by-channel convolution sub-module sequentially arranged along a data flow direction, and the output result of the GhostConv module is an output result of channel-merging the convolution result of the standard convolution sub-module and the convolution result of the channel-by-channel convolution sub-module.
4. The method of claim 1, wherein calculating the loss between the predicted box with the defect feature and the target image real box is accomplished by a SIoU regression loss function, including angle loss, distance loss, shape loss, and IoU loss;
the angle loss Λ is calculated by the formula
Λ = 1 - 2·sin²(arcsin(sin α) - π/4), sin α = |b_cy^gt - b_cy| / σ, σ = √((b_cx^gt - b_cx)² + (b_cy^gt - b_cy)²)  (1)(2)
wherein b_cx^gt and b_cy^gt represent the center coordinate values of the real frame, and b_cx and b_cy represent the center coordinate values of the prediction frame;
the distance loss Δ is calculated by the formula
Δ = Σ_{t=x,y} (1 - e^(-γ·ρ_t)), ρ_x = ((b_cx^gt - b_cx)/c_w)², ρ_y = ((b_cy^gt - b_cy)/c_h)², γ = 2 - Λ  (3)
wherein c_h and c_w are defined as the height and width of the smallest rectangle that encloses both anchor frames;
the shape loss Ω is calculated by the formula
Ω = Σ_{t=w,h} (1 - e^(-ω_t))^θ, ω_w = |w - w^gt| / max(w, w^gt), ω_h = |h - h^gt| / max(h, h^gt)  (4)
wherein w and h are defined as the width and height of the model output bounding box, w^gt and h^gt are defined as the width and height of the actual frame of the object, and θ is a variable factor representing the weight of the shape loss;
the IoU loss is calculated by the formula
IoU = |A ∩ B| / |A ∪ B|  (5)
wherein A and B respectively represent the two rectangular frames;
based on the formulas (1) to (5), the SIoU regression loss function formula is obtained:
L_SIoU = 1 - IoU + (Δ + Ω) / 2  (6).
5. The lightweight texture surface defect detection system based on deep learning is characterized by comprising a training module and a testing module;
the training module has a training set, and is further configured to:
based on a texture surface image of a training set, obtaining a prediction frame with defect characteristics through layer-by-layer convolution forward propagation processing of a network model provided with a shufflenet 2 lightweight network;
obtaining loss between a prediction frame with defect characteristics and a target image real frame through calculation, and reversely transmitting the loss to the network model to update model parameters;
repeatedly executing the process until the preset iteration times are reached, and obtaining a texture surface defect image;
the test module adds the texture surface defect image output by the training module into a test set of the test module, and is also used for: and testing and evaluating the texture surface defect image, if the evaluation result does not reach the preset requirement, modifying the super parameters of the network model, returning to the execution process of the training module, and otherwise, outputting the texture surface defect image.
CN202310591633.3A 2023-05-24 2023-05-24 Light texture surface defect detection method and system based on deep learning Pending CN116777842A (en)

Publications (1)

Publication Number Publication Date
CN116777842A true CN116777842A (en) 2023-09-19

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117496475A (en) * 2023-12-29 2024-02-02 武汉科技大学 Target detection method and system applied to automatic driving
CN117496475B (en) * 2023-12-29 2024-04-02 武汉科技大学 Target detection method and system applied to automatic driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination