CN117635585A - Texture surface defect detection method based on teacher-student network - Google Patents
- Publication number
- CN117635585A (Application No. CN202311663492.8A)
- Authority
- CN
- China
- Prior art keywords
- feature
- scale
- defect
- image
- normal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/52—Scale-space analysis, e.g. wavelet analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/809—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention belongs to the field of image processing and particularly relates to a texture surface defect detection method based on a teacher-student network. The method is implemented with a defect detection model that is trained on texture surface defect images and comprises a student network and a dual-branch decoding module, and includes the following steps: encoding the image to be detected with a pre-trained teacher network and with the student network, respectively; using the dual-branch decoding module to perform semantic segmentation and feature recovery on the deepest-scale feature among the multi-scale features obtained by the student network, thereby obtaining a defect segmentation image and pseudo-normal features with the same scales as the multi-scale features obtained by the teacher network; calculating the cosine similarity between the pseudo-normal features at each scale and the corresponding features obtained by the teacher network to construct a global anomaly score map; and fusing the defect segmentation image with the global anomaly score map to obtain a detection result image, completing defect detection. The invention enables accurate detection of various texture defects in complex industrial environments.
Description
Technical Field
The invention belongs to the field of image processing, and particularly relates to a texture surface defect detection method based on a teacher-student network.
Background
Modern industrial production is generally highly automated and fast, and production flows have become increasingly complex, so defects or flaws inevitably appear on the surfaces of many products and seriously affect the yield and efficiency of the production line. With the continuous progress of computer image processing technology, texture surface defect detection methods based on machine vision keep emerging. However, the normal regions and defect regions on most textured product surfaces usually exhibit low contrast: there is no obvious transition region between them, clear imaging and distinct contour boundaries are difficult to obtain, and severe interference from the textured background degrades detection. Moreover, defect regions interfere with the surrounding texture to different degrees on irregular and regular textures, which requires the detection technique to generalize well. Owing to the complex diversity of textures and defects and the unpredictable, large-scale variation of defects, robust detection of texture surface defects has long been a challenging task.
As large manufacturers attach increasing importance to the detection of product surface defects, the traditional practice of judging defects by human eyes, which is subjective, unstable and inefficient, is gradually being replaced by modern automatic optical inspection (AOI) equipment with advantages such as non-contact operation, high speed and high precision. The theoretical basis of AOI technology comes from machine vision and mainly comprises four parts: visual imaging, visual positioning, visual detection and visual classification, of which the core is the detection algorithm of the visual detection part. The upper limit of an AOI device's detection performance generally depends on the performance of this algorithm. Studying high-performance texture surface defect detection algorithms that achieve high-precision, high-robustness detection of various texture surface defects is therefore of great significance for improving the quality and efficiency of automated industrial production.
A large number of texture surface defect detection algorithms have been proposed, but they struggle to achieve high-precision and high-efficiency detection in actual production scenes. Image-reconstruction-based methods attempt to reconstruct a defective image into a normal image and then compute the residual between the original and reconstructed images to detect defects, but they tend to reconstruct the defect itself faithfully, which leads to missed detections. Embedding-similarity-based methods must continually compute and store embedding vectors of normal images during training and then, at test time, compute the distance between the embedding vector of the image under test and the stored normal embeddings, a region being judged defective if that distance is large.
Therefore, in order to improve the production process and raise the yield, a robust texture surface defect detection algorithm is needed that can deliver excellent detection results in the face of complex, changing production environments and texture surface defects.
Disclosure of Invention
Aiming at the defects and improvement requirements of the prior art, the invention provides a texture surface defect detection method based on a teacher-student network, and aims to provide a robust texture surface defect detection algorithm which can accurately detect various texture defects in a complex industrial environment.
To achieve the above object, according to one aspect of the present invention, there is provided a texture surface defect detection method based on a teacher-student network, implemented using a texture surface defect detection model including a student network and a dual-branch decoding module trained based on texture surface defect images, comprising:
respectively encoding the texture surface images to be detected by adopting a pre-training teacher network and the student network to obtain corresponding multi-scale characteristics;
adopting the dual-branch decoding module to perform semantic segmentation and feature recovery on the deepest-scale feature among the multi-scale features obtained by the student network, correspondingly obtaining a defect segmentation image and pseudo-normal features with the same scales as the multi-scale features obtained by the teacher network;
calculating the cosine similarity between each scale feature of the multi-scale pseudo-normal features and the corresponding scale feature of the multi-scale features obtained by the teacher network to obtain a corresponding anomaly score map; up-sampling the anomaly score maps of all scales to the size of the texture surface image to be detected and multiplying them pixel by pixel to obtain a global anomaly score map;
and fusing the defect segmentation image and the global anomaly score image to obtain a detection result image, and finishing texture surface defect detection.
Further, the texture surface defect image is an artificial defect image in a training stage, and is constructed by the following steps:
carrying out enhancement processing on one texture surface normal image; performing thresholding and binarization on a noise image generated by Perlin noise to obtain an artificial defect mask image; and fusing another texture surface normal image, the enhanced texture surface normal image and the artificial defect mask image to obtain the artificial defect image.
Further, the loss function used in the training process of the texture surface defect detection model comprises: pixel level contrast decoupling distillation loss, semantic segmentation loss, and feature recovery loss;
the pixel-level contrast decoupling distillation loss is used to explicitly guide the student network to distinguish the encoding of normal features from the encoding of defect features in the image, and is constructed as follows: downsampling the artificial defect mask image to obtain semantic labels for the texture and defect features; according to the semantic labels, decoupling and separating the texture and defect classes within the artificial defect feature embedding to obtain a normal feature vector set and a defect feature vector set, and calculating the pixel-level contrast decoupling distillation loss based on these two sets so as to increase the similarity between the normal feature vector set and the normal feature embedding and to reduce the similarity between the defect feature vector set and the normal feature embedding; the artificial defect feature embedding is the deepest-scale artificial defect feature among the multi-scale artificial defect features obtained by encoding the artificial defect image with the student network; the normal feature embedding is obtained by encoding another texture surface normal image with the pre-trained teacher network and fine-tuning the deepest-scale feature of the resulting multi-scale normal features in the spatial dimension with a linear mapper;
the semantic segmentation loss is used for supervising a defect segmentation image output by a semantic segmentation branch in the double-branch decoding module to be close to the artificial defect mask image;
and the characteristic recovery loss is used for supervising the characteristic recovery branch in the double-branch decoding module to ensure that the output multi-scale pseudo normal characteristic approaches to the multi-scale normal characteristic.
Further, the pixel-level contrast decoupling distillation loss is expressed as:
wherein L_intra represents the pixel-level contrast decoupling distillation loss, VN represents the number of normal feature vectors in the normal feature vector set and VD the number of defect feature vectors in the defect feature vector set; the remaining terms are the i-th normal feature vector, the j-th defect feature vector, the feature vector of the normal feature embedding located at the same spatial position as the i-th normal feature vector, and the feature vector of the normal feature embedding located at the same spatial position as the j-th defect feature vector; τ represents a temperature parameter used to smooth the data distribution, and sim(·) represents cosine similarity.
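The formula itself did not survive text extraction. A plausible InfoNCE-style form that is consistent with the variable definitions above is given below; the notation (v_i^n, v_j^d for the i-th normal and j-th defect feature vectors, z_i^n, z_j^n for the spatially aligned vectors of the normal feature embedding) is introduced here for illustration, and the expression is an assumption rather than the patent's exact formula:

$$
L_{\mathrm{intra}} = -\frac{1}{VN}\sum_{i=1}^{VN}\log\frac{\exp\left(\mathrm{sim}(v_i^{n}, z_i^{n})/\tau\right)}{\exp\left(\mathrm{sim}(v_i^{n}, z_i^{n})/\tau\right)+\sum_{j=1}^{VD}\exp\left(\mathrm{sim}(v_j^{d}, z_j^{n})/\tau\right)}
$$

Such a form raises the similarity of normal feature vectors to the normal feature embedding while suppressing the similarity of defect feature vectors to it, matching the stated purpose of the loss.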
Further, the pixel-level contrast decoupling distillation loss is expressed as:
wherein L_c represents the pixel-level contrast decoupling distillation loss, and the remaining two terms are the feature vector, within another randomly selected normal feature embedding of the same batch, located at the same spatial position as the i-th normal feature vector, and the feature vector within that other normal feature embedding located at the same spatial position as the j-th defect feature vector.
Further, the feature recovery loss is specifically a global feature mask perceived distillation loss, constructed as follows:
performing mask perception distillation on each scale feature of the multi-scale normal features with a semantic token trained together with the detection model, generating a first feature-level mask at the corresponding scale; performing mask perception distillation on each scale feature of the multi-scale pseudo-normal features with the same trainable semantic token, obtaining a second feature-level mask at the corresponding scale; and fusing the first feature-level mask and the second feature-level mask at each scale to obtain the global perception mask at that scale;
adopting a dense supervision mode, calculating the L2 distance between the normal feature and the pseudo-normal feature at each scale in the spatial dimension, and combining that L2 distance with the global perception mask at the same scale to calculate the feature mask perceived distillation loss at that scale; and synthesizing the feature mask perceived distillation losses at all scales to obtain the global feature mask perceived distillation loss.
Further, the feature mask perceived distillation loss at each scale is expressed as:
wherein H_k represents the feature height at the k-th scale, W_k represents the feature width at the k-th scale, M_k represents the global perception mask at the k-th scale, and D_k represents said L2 distance at the k-th scale.
Further, the semantic segmentation loss is expressed as:
wherein E(·) denotes the expectation, ‖·‖_1 denotes the L1 norm, I_m denotes the artificial defect mask image, I_s denotes the defect segmentation image, and I_d denotes the artificial defect image.
The invention also provides a computer readable storage medium comprising a stored computer program, wherein the computer program, when run by a processor, controls a device in which the storage medium is located to perform a texture surface defect detection method as described above.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) The texture surface defect detection method provided by the invention combines a pre-trained teacher network with a texture surface defect detection model composed of a student network and a dual-branch decoding module, the dual-branch decoding module performing both feature recovery and semantic segmentation. First, the student network encodes the image to be detected into multi-scale features, and the dual-branch decoding module performs semantic segmentation and feature recovery on the deepest-scale feature; meanwhile, the teacher network also encodes the image to be detected into another set of multi-scale features. The pseudo-normal features obtained by feature recovery are required to have the same number of scales as the features obtained by the teacher network, so that the similarity between the pseudo-normal features and the teacher features can be computed at each scale to construct a global anomaly score map. This process fully exploits the knowledge of the pre-trained teacher network, generalizes it to the texture surface defect detection task, and uses the teacher's prior knowledge to achieve fine defect segmentation. Further, the defect segmentation image obtained by semantic segmentation and the global anomaly score map are combined: the defect segmentation image produced by the semantic segmentation branch is finer, has clear contours and is suitable for detecting structural defects, whereas the global anomaly score map produced by the feature recovery branch is robust and suitable for detecting subtle logical defects. By synthesizing the results of the two branches, the invention can greatly improve the detection precision for various texture surface defects and has high application value.
(2) When training the texture surface defect detection model, the invention introduces a pixel-level contrast decoupling distillation loss that explicitly drives the network to learn distinct representations of normal and defect features during training, guiding the student network to effectively decouple and separate the two. This effectively strengthens the discriminative ability of the model and further improves detection performance.
(3) When training the texture surface defect detection model, the invention also introduces a global feature mask perceived distillation loss. Combined with the normal features of each scale obtained from the teacher network encoding, it promotes the transfer of useful knowledge by the feature recovery branch of the dual-branch decoding module, makes the network pay more attention to valuable prototypical prior feature information when recovering features, and comprehensively perceives the contextual dependencies among multi-level features, which further raises the detection rate and lowers the over-detection rate.
Drawings
FIG. 1 is a flow chart of a texture surface defect detection method based on a teacher-student network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a training process of a texture surface defect detection model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an implementation of a perception mask according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the feature contrast decoupling effect provided by an embodiment of the present invention;
fig. 5 is a schematic diagram of defect detection effect according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention clearer. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
The texture surface defect detection method based on the teacher-student network is realized by adopting a texture surface defect detection model which is trained based on texture surface defect images and comprises a student network and a double-branch decoding module, as shown in figure 1, and comprises the following steps:
respectively encoding the texture surface images to be detected by adopting a pre-training teacher network and the student network to obtain corresponding multi-scale characteristics;
adopting the dual-branch decoding module to perform semantic segmentation and feature recovery on the deepest-scale feature among the multi-scale features obtained by the student network, correspondingly obtaining a defect segmentation image and pseudo-normal features with the same scales as the multi-scale features obtained by the teacher network;
calculating the cosine similarity between each scale feature of the multi-scale pseudo-normal features and the corresponding scale feature of the multi-scale features obtained by the teacher network to obtain a corresponding anomaly score map; up-sampling the anomaly score maps of all scales to the size of the texture surface image to be detected and multiplying them pixel by pixel to obtain a global anomaly score map;
and fusing the defect segmentation image and the global anomaly score image to obtain a detection result image, and finishing texture surface defect detection.
In a preferred embodiment, the texture surface defect image is an artificial defect image in the training stage, and is constructed by the following steps:
carrying out enhancement processing on one texture surface normal image; performing thresholding and binarization on a noise image generated by Perlin noise to obtain an artificial defect mask image; and fusing another texture surface normal image, the enhanced texture surface normal image and the artificial defect mask image to obtain the artificial defect image.
Specifically, taking the generation of a single artificial defect image as an example, two normal images I_t and I_n are first randomly selected without replacement. Then a data enhancement transform set is constructed (gamma correction, brightness transformation, hue transformation, overexposure enhancement, pixel inversion, Gaussian blur and elastic transformation), and three transformations are preferably selected at random from the set to process I_t, obtaining the enhanced image I_u:
I_u = f_aug(I_t);
wherein I_u, I_t ∈ R^{W×H×C}, and W, H and C respectively denote the width, height and number of channels of the image, illustratively 256, 256 and 3 in this embodiment; f_aug(·) denotes the mapping function composed of the three random transformations.
Then, the noise image I_p generated by Perlin noise is thresholded and binarized to obtain the artificial defect mask image I_m:
I_m = f_tb(I_p);
wherein I_m ∈ R^{W×H×C_m} and, illustratively, C_m is 1 in this embodiment; f_tb(·) denotes the threshold binarization operation.
Finally, the images I_u, I_n and I_m are fused according to the following formula to obtain the artificial defect image I_d:
I_d = I_n ⊙ (1-I_m) + I_u ⊙ I_m;
wherein I_d, I_n ∈ R^{W×H×C}, and ⊙ denotes the pixel-wise multiplication operation, i.e. the Hadamard product.
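For illustration, the following Python sketch performs this synthesis step on NumPy arrays. The smoothed-random-noise generator used as a stand-in for Perlin noise, the particular augmentations and the threshold value are assumptions made for the example, not the patent's exact settings.

```python
import numpy as np

def pseudo_perlin(h, w, res=8, seed=None):
    # Low-frequency random grid, bilinearly upsampled: a simple stand-in for
    # Perlin noise, used only to produce smooth blob-like masks.
    rng = np.random.default_rng(seed)
    grid = rng.random((res + 1, res + 1))
    ys = np.linspace(0, res, h); xs = np.linspace(0, res, w)
    y0 = np.floor(ys).astype(int).clip(0, res - 1)
    x0 = np.floor(xs).astype(int).clip(0, res - 1)
    fy = (ys - y0)[:, None]; fx = (xs - x0)[None, :]
    tl = grid[np.ix_(y0, x0)]; tr = grid[np.ix_(y0, x0 + 1)]
    bl = grid[np.ix_(y0 + 1, x0)]; br = grid[np.ix_(y0 + 1, x0 + 1)]
    top = tl * (1 - fx) + tr * fx
    bot = bl * (1 - fx) + br * fx
    return top * (1 - fy) + bot * fy

def synthesize_defect(i_t, i_n, thr=0.6, seed=None):
    """i_t, i_n: float arrays in [0, 1] of shape (H, W, C)."""
    h, w, _ = i_t.shape
    rng = np.random.default_rng(seed)
    # 1) augment one normal image (brightness jitter + channel flip stand in here
    #    for the three randomly chosen transformations of the text)
    i_u = np.clip(i_t * rng.uniform(0.7, 1.3), 0, 1)[:, :, ::-1]
    # 2) threshold and binarize a noise image to get the defect mask I_m
    i_m = (pseudo_perlin(h, w, seed=seed) > thr).astype(i_t.dtype)[..., None]
    # 3) blend: I_d = I_n * (1 - I_m) + I_u * I_m
    i_d = i_n * (1 - i_m) + i_u * i_m
    return i_d, i_m
```

In practice the mask generator would be replaced by a true Perlin-noise implementation and the augmentation by three transforms drawn from the set listed above.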
Further, the training process of the texture surface defect detection model may be:
encoding the artificial defect image with the student network to be trained to obtain multi-scale artificial defect features, and taking the deepest-scale artificial defect feature among them as the artificial defect feature embedding;
adopting the dual-branch decoding module to be trained to perform semantic segmentation and feature recovery on the artificial defect feature embedding, correspondingly obtaining a defect segmentation image and multi-scale pseudo-normal features with the same number of scales as the multi-scale normal features; the multi-scale normal features are obtained by encoding the texture surface normal image I_n with the pre-trained teacher network;
based on the loss function, optimizing parameters of the student network to be trained and the double-branch decoding module, and repeating the process until convergence to obtain the texture surface defect detection model.
In this embodiment, the teacher network and the student network can be regarded together as a feature extraction and mapping module. As shown in FIG. 2, during training of the texture surface defect detection model, the feature extraction and mapping module first encodes the normal texture surface image I_n and the corresponding artificial defect image I_d respectively:
wherein the two outputs respectively denote the multi-scale features obtained by encoding the normal image I_n and the corresponding artificial defect image I_d, each composed of the three-dimensional features of the k-th layer; N_n and N_d respectively denote the numbers of selected feature levels (i.e. the numbers of scales), illustratively 2 and 3; f_T(·) and θ_T respectively denote the function and parameters of the teacher network in the feature extraction and mapping module, which is pre-trained and whose parameters are frozen and not updated (for example, ResNet18 can be chosen as the teacher network); f_Se(·) and θ_Se respectively denote the function and parameters of the student network in the feature extraction and mapping module.
Considering the domain gap between the teacher network's pre-training dataset and the algorithm's target dataset, the feature extraction and mapping module applies a linear mapper to the deepest-scale feature of the multi-scale normal features and fine-tunes it in the spatial dimension to obtain the normal feature embedding Z_n:
wherein, as an example, the width, height and channel dimensions W_l, H_l and C_l of Z_n are 32, 32 and 128 respectively; f_pt(·) and θ_pt respectively denote the function and parameters of the above linear mapper. Z_n can be regarded as a set of feature vectors.
As a preferred embodiment, the loss function used in the training process of the texture surface defect detection model comprises the following steps: pixel level contrast decoupling distillation loss, semantic segmentation loss, and feature recovery loss.
The pixel level contrast decoupling distillation loss is used for explicitly guiding the student network to distinguish between normal feature codes and defect feature codes in the image, and as shown in fig. 2, the construction mode is as follows: downsampling the mask image of the artificial defect to obtain semantic tags of texture and defect characteristics; according to the semantic tag, decoupling and separating two types of textures and defects in artificial defect feature embedding to obtain a normal feature vector set and a defect feature vector set, and calculating pixel-level contrast decoupling distillation loss based on the normal feature vector set and the defect feature vector set so as to improve the similarity between the normal feature vector set and the deepest scale feature in the multi-scale normal features and reduce the similarity between the defect feature vector set and the deepest scale feature in the multi-scale normal features; the artificial defect feature is embedded into the deepest scale artificial defect feature in the multi-scale artificial defect features obtained by encoding the artificial defect image through the student network.
The above construction process of the pixel-level contrast decoupling distillation loss can be called the contrast decoupling distillation module. The module views the artificial defect feature embedding (i.e., the deepest-scale artificial defect feature of the multi-scale artificial defect features) in the spatial dimension as a feature embedding Z_d composed of a series of vectors; Z_d and Z_n are exactly aligned in every dimension (both spatial and channel), and the feature embedding Z_n of the normal image is used to guide the contrast decoupling distillation between the normal and defect features in Z_d.
First, the artificial defect mask image I_m is downsampled to obtain the semantic label I_l for the two feature classes (texture and defect):
I_l = f_ds(I_m);
wherein f_ds(·) denotes the downsampling operation.
Then, according to the label, the texture feature vectors and defect feature vectors in the set are classified (i.e. decoupled and separated):
wherein the two resulting subsets respectively contain the normal feature vectors and the defect feature vectors, and VN and VD respectively denote the numbers of the two types of features in their respective sets.
Next, for convenience of subsequent representation, the cosine similarity between two feature vectors v_i and v_j is denoted sim(v_i, v_j):
sim(v_i, v_j) = (v_i · v_j) / (‖v_i‖_2 ‖v_j‖_2);
wherein · denotes the inner product and ‖·‖_2 denotes the L2 norm.
Then, the pixel-level contrast decoupling distillation loss L_intra is calculated according to the following formula:
wherein τ denotes a temperature parameter used to smooth the data distribution, which may for example be set to 0.1; the remaining terms denote the feature vector of the normal feature embedding located at the same spatial position as the i-th normal feature vector and the feature vector of the normal feature embedding located at the same spatial position as the j-th defect feature vector.
Further, as a preferred embodiment, to enhance the robustness of the algorithm, an artificial perturbation is introduced into the normal feature embeddings Z_n within the same batch: they are randomly shuffled along the batch dimension to form a non-aligned vector set whose elements are feature vectors taken from another, randomly chosen, normal feature embedding of the same batch. On this basis, the contrast decoupling distillation loss L_inter between whole images is calculated in the same form as L_intra:
Finally, the contrast decoupling distillation loss L_c is obtained by combining the two parts L_intra and L_inter:
In this embodiment, L_c assigns equal weights to the two parts L_intra and L_inter, indicating the same degree of confidence in both. As shown in FIG. 4, the contrast decoupling distillation loss L_c provides an effective constraint that promotes class separability between features.
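The following PyTorch-style sketch shows one way such a pixel-level contrastive decoupling loss could be implemented. The InfoNCE-style normalization, the batch shuffle used for the inter-image term, the temperature value and the equal weighting of the two parts follow the description above, but the exact functional form is an assumption rather than the patent's formula.

```python
import torch
import torch.nn.functional as F

def contrast_decoupling_loss(z_d, z_n, mask, tau=0.1):
    """z_d, z_n: (B, C, H, W) defect / normal feature embeddings,
    mask: (B, 1, H, W) binary defect mask downsampled to the feature size."""
    b, c, h, w = z_d.shape
    zd = F.normalize(z_d, dim=1).permute(0, 2, 3, 1).reshape(-1, c)   # (B*H*W, C)
    zn = F.normalize(z_n, dim=1).permute(0, 2, 3, 1).reshape(-1, c)
    m = mask.permute(0, 2, 3, 1).reshape(-1) > 0.5                     # defect positions

    sim = (zd * zn).sum(dim=1) / tau       # cosine similarity with aligned normal vectors
    pos = torch.exp(sim[~m])               # normal vectors: pulled towards the normal embedding
    neg = torch.exp(sim[m])                # defect vectors: pushed away from the normal embedding
    l_intra = -torch.log(pos / (pos.sum() + neg.sum() + 1e-8)).mean()

    # inter-image term: compare against a randomly shuffled normal embedding of the batch
    zn_shuf = zn.reshape(b, h * w, c)[torch.randperm(b)].reshape(-1, c)
    sim_s = (zd * zn_shuf).sum(dim=1) / tau
    pos_s = torch.exp(sim_s[~m]); neg_s = torch.exp(sim_s[m])
    l_inter = -torch.log(pos_s / (pos_s.sum() + neg_s.sum() + 1e-8)).mean()

    return l_intra + l_inter               # equal weights for the two parts
```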
The dual-branch decoding module is used for semantic segmentation and feature recovery. The semantic segmentation is expressed as:
I_s = f_Ss(Z_d; θ_Ss);
wherein I_s ∈ R^{W×H×1} denotes the defect segmentation image, and f_Ss(·) and θ_Ss respectively denote the function and parameters of the semantic segmentation branch. The semantic segmentation loss supervises the defect segmentation image output by the semantic segmentation branch of the dual-branch decoding module so that it approaches the artificial defect mask image; the segmentation loss L_s is calculated according to the following formula:
wherein E(·) denotes the expectation, ‖·‖_1 denotes the L1 norm, I_m denotes the artificial defect mask image, I_s denotes the defect segmentation image, and I_d denotes the artificial defect image.
The above feature recovery is expressed as a decoding of Z_d into multi-level pseudo-normal features, wherein the recovered multi-level pseudo-normal features are the output of the feature recovery branch, N_r denotes the number of pseudo-normal feature levels and takes the same value as N_n (illustratively 2), and f_Sr(·) and θ_Sr respectively denote the function and parameters of the feature recovery branch.
The above feature recovery loss is used to supervise the feature recovery branch of the dual-branch decoding module so that the multi-scale pseudo-normal features it outputs approach the multi-scale normal features output by the teacher network. As a preferred embodiment, the feature recovery loss is specifically the global feature mask perceived distillation loss, whose construction can be implemented by a mask perception distillation module. As shown in FIG. 2, the mask perception distillation module first introduces a trainable semantic token to generate feature-level masks; specifically, as shown in FIG. 3, the perception mask at each scale is computed as follows:
wherein the two feature-level masks respectively denote the masks generated after the semantic token processes the normal features and the recovered pseudo-normal features, M_k denotes the global perception mask learned by the network, and f_mg(·) and θ_mg respectively denote the function and parameters of the mask perception distillation.
Then, at each scale, the L2 distance between the normal feature and the recovered pseudo-normal feature is calculated in the spatial dimension in a dense supervision manner:
Next, the feature mask perceived distillation loss at each scale is calculated by:
Finally, the multi-scale losses are synthesized to obtain the global mask perceived distillation loss L_m:
wherein K denotes the number of feature levels (i.e. the number of scales) used in the algorithm, illustratively 2. During training, L_m effectively constrains the model to pay more attention to the distillation and transfer of important contextual information.
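The per-scale and global formulas are likewise missing from the extracted text. A plausible reconstruction consistent with the variable definitions (H_k, W_k, M_k, D_k and K), stated as an assumption rather than the patent's exact expression, is the mask-weighted mean of the per-position L2 distances, summed over scales:

$$
L_m^{(k)} = \frac{1}{H_k W_k}\sum_{h=1}^{H_k}\sum_{w=1}^{W_k} M_k(h,w)\, D_k(h,w), \qquad L_m = \sum_{k=1}^{K} L_m^{(k)}
$$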
During training of the texture surface defect detection model, the overall training loss L can be calculated from the obtained contrast decoupling distillation loss L_c, semantic segmentation loss L_s and mask perceived distillation loss L_m, and training iterations are carried out on the whole model until convergence, yielding the texture surface defect detection model. The overall loss in the training phase is calculated as:
L = λ_c L_c + λ_s L_s + λ_m L_m;
wherein λ_c, λ_s and λ_m denote adjustable hyper-parameters used to balance the loss terms, illustratively set to 0.1, 10 and 100 respectively.
After the texture surface defect detection model has been trained, an image to be detected can be input in the testing stage. Features are extracted by the feature extraction and mapping module: the teacher network outputs multi-level features, and the features output by the student network are processed by the dual-branch decoding module to obtain the defect segmentation image I_s and the multi-level recovered features; illustratively, N_n and N_r are both 2. The anomaly score map of each level is obtained by calculating the cosine similarity between the recovered features of that level and the corresponding output features of the teacher network:
These anomaly score maps are upsampled to the input image size and then multiplied pixel by pixel to obtain the global anomaly score map I_a:
wherein I_a ∈ R^{W×H×1}, and U(·) denotes an interpolation upsampling operation.
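A minimal PyTorch sketch of this scoring step is given below. It assumes that per-pixel anomaly is taken as one minus cosine similarity and that bilinear interpolation is used for upsampling; both choices are consistent with the description but are not stated explicitly in it.

```python
import torch
import torch.nn.functional as F

def global_anomaly_map(teacher_feats, recovered_feats, out_size):
    """teacher_feats, recovered_feats: lists of (B, C_k, H_k, W_k) tensors, one per scale."""
    score = None
    for f_t, f_r in zip(teacher_feats, recovered_feats):
        sim = F.cosine_similarity(f_t, f_r, dim=1, eps=1e-8)      # (B, H_k, W_k)
        a_k = (1.0 - sim).unsqueeze(1)                             # per-pixel anomaly score
        a_k = F.interpolate(a_k, size=out_size, mode="bilinear",
                            align_corners=False)                   # upsample to image size
        score = a_k if score is None else score * a_k              # pixel-wise product over scales
    return score                                                    # (B, 1, H, W)
```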
Finally, the defect segmentation image I_s and the anomaly score map I_a are fused to obtain the detection result image I_r:
I_r = G(λ_f I_a + (1-λ_f) I_s · max(I_a));
wherein I_r ∈ R^{W×H×1}; G denotes Gaussian denoising; λ_f, with 0 ≤ λ_f ≤ 1, is the weight ratio between the two terms and is illustratively set to 0.5, indicating the same degree of confidence in the defect segmentation image and the anomaly score map; max(I_a) denotes the maximum value within the anomaly score map I_a.
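The fusion itself can be written in a few lines; in the sketch below the Gaussian-blur kernel size is an assumed value, since the text only specifies a Gaussian denoising operation G and λ_f = 0.5.

```python
import torch
from torchvision.transforms.functional import gaussian_blur

def fuse_results(i_s, i_a, lam_f=0.5, kernel_size=5):
    """i_s: defect segmentation image, i_a: global anomaly score map, both (B, 1, H, W)."""
    fused = lam_f * i_a + (1.0 - lam_f) * i_s * i_a.amax(dim=(-2, -1), keepdim=True)
    return gaussian_blur(fused, kernel_size=kernel_size)   # G: Gaussian denoising
```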
Using the detection method provided by this embodiment, the detection results for various texture surface defects are shown in FIG. 5, where the first row is the test image, the second row is the ground-truth image, the third row is the heat map of the detection result on the test image, and the fourth row is the detection result image after binarization. As can be seen from the figure, the texture surface defect detection method provided by this embodiment achieves high-precision and high-efficiency detection of texture surface defects of different scales, shapes, contrasts and unknown types without using any real defect image, and can effectively improve the quality and efficiency of automated industrial production.
Example two
A computer readable storage medium comprising a stored computer program, wherein the computer program, when executed by a processor, controls a device on which the storage medium resides to perform a texture surface defect detection method as described in the first embodiment.
The related technical solution is the same as the first embodiment, and will not be described herein.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention; any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (9)
1. The texture surface defect detection method based on the teacher-student network is characterized by being realized by adopting a texture surface defect detection model which is trained based on texture surface defect images and comprises a student network and a double-branch decoding module, and comprises the following steps of:
respectively encoding the texture surface images to be detected by adopting a pre-training teacher network and the student network to obtain corresponding multi-scale characteristics;
adopting the dual-branch decoding module to perform semantic segmentation and feature recovery on the deepest-scale feature among the multi-scale features obtained by the student network, correspondingly obtaining a defect segmentation image and pseudo-normal features with the same scales as the multi-scale features obtained by the teacher network;
calculating the cosine similarity between each scale feature of the multi-scale pseudo-normal features and the corresponding scale feature of the multi-scale features obtained by the teacher network to obtain a corresponding anomaly score map; up-sampling the anomaly score maps of all scales to the size of the texture surface image to be detected and multiplying them pixel by pixel to obtain a global anomaly score map;
and fusing the defect segmentation image and the global anomaly score image to obtain a detection result image, and finishing texture surface defect detection.
2. The method according to claim 1, wherein the texture surface defect image is an artificial defect image in a training phase, and is constructed by:
carrying out enhancement processing on one texture surface normal image; performing thresholding and binarization on a noise image generated by Perlin noise to obtain an artificial defect mask image; and fusing another texture surface normal image, the enhanced texture surface normal image and the artificial defect mask image to obtain the artificial defect image.
3. The textured surface defect detection method of claim 2, wherein the loss function employed in the training of the textured surface defect detection model comprises: pixel level contrast decoupling distillation loss, semantic segmentation loss, and feature recovery loss;
the pixel-level contrast decoupling distillation loss is used to explicitly guide the student network to distinguish the encoding of normal features from the encoding of defect features in the image, and is constructed as follows: downsampling the artificial defect mask image to obtain semantic labels for the texture and defect features; according to the semantic labels, decoupling and separating the texture and defect classes within the artificial defect feature embedding to obtain a normal feature vector set and a defect feature vector set, and calculating the pixel-level contrast decoupling distillation loss based on these two sets so as to increase the similarity between the normal feature vector set and the normal feature embedding and to reduce the similarity between the defect feature vector set and the normal feature embedding; the artificial defect feature embedding is the deepest-scale artificial defect feature among the multi-scale artificial defect features obtained by encoding the artificial defect image with the student network; the normal feature embedding is obtained by encoding another texture surface normal image with the pre-trained teacher network and fine-tuning the deepest-scale feature of the resulting multi-scale normal features in the spatial dimension with a linear mapper;
the semantic segmentation loss is used for supervising a defect segmentation image output by a semantic segmentation branch in the double-branch decoding module to be close to the artificial defect mask image;
and the characteristic recovery loss is used for supervising the characteristic recovery branch in the double-branch decoding module to ensure that the output multi-scale pseudo normal characteristic approaches to the multi-scale normal characteristic.
4. The texture surface defect detection method according to claim 3, wherein the pixel-level contrast decoupling distillation loss is expressed as:
wherein L_intra represents the pixel-level contrast decoupling distillation loss, VN represents the number of normal feature vectors in the normal feature vector set and VD the number of defect feature vectors in the defect feature vector set; the remaining terms are the i-th normal feature vector, the j-th defect feature vector, the feature vector of the normal feature embedding located at the same spatial position as the i-th normal feature vector, and the feature vector of the normal feature embedding located at the same spatial position as the j-th defect feature vector; τ represents a temperature parameter used to smooth the data distribution, and sim(·) represents cosine similarity.
5. The texture surface defect detection method according to claim 3, wherein the pixel-level contrast decoupling distillation loss is expressed as:
wherein L_c represents the pixel-level contrast decoupling distillation loss, and the remaining two terms are the feature vector, within another randomly selected normal feature embedding of the same batch, located at the same spatial position as the i-th normal feature vector, and the feature vector within that other normal feature embedding located at the same spatial position as the j-th defect feature vector.
6. The method for detecting defects on a textured surface according to claim 3, wherein the feature recovery loss is specifically a global feature mask perceived distillation loss, and is constructed by:
performing mask perception distillation on each scale feature of the multi-scale normal features with a semantic token trained together with the detection model, generating a first feature-level mask at the corresponding scale; performing mask perception distillation on each scale feature of the multi-scale pseudo-normal features with the same trainable semantic token, obtaining a second feature-level mask at the corresponding scale; and fusing the first feature-level mask and the second feature-level mask at each scale to obtain the global perception mask at that scale;
adopting a dense supervision mode, calculating the L2 distance between the normal feature and the pseudo-normal feature at each scale in the spatial dimension, and combining that L2 distance with the global perception mask at the same scale to calculate the feature mask perceived distillation loss at that scale; and synthesizing the feature mask perceived distillation losses at all scales to obtain the global feature mask perceived distillation loss.
7. The texture surface defect detection method according to claim 6, wherein the feature mask perceived distillation loss at each scale is expressed as:
wherein H_k represents the feature height at the k-th scale, W_k represents the feature width at the k-th scale, M_k represents the global perception mask at the k-th scale, and D_k represents said L2 distance at the k-th scale.
8. The texture surface defect detection method according to claim 3, wherein the semantic segmentation loss is expressed as:
wherein E(·) denotes the expectation, ‖·‖_1 denotes the L1 norm, I_m denotes the artificial defect mask image, I_s denotes the defect segmentation image, and I_d denotes the artificial defect image.
9. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored computer program, wherein the computer program, when run by a processor, controls a device in which the storage medium is located to perform the textured surface defect detection method according to any of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311663492.8A CN117635585A (en) | 2023-12-06 | 2023-12-06 | Texture surface defect detection method based on teacher-student network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311663492.8A CN117635585A (en) | 2023-12-06 | 2023-12-06 | Texture surface defect detection method based on teacher-student network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117635585A true CN117635585A (en) | 2024-03-01 |
Family
ID=90021343
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311663492.8A Pending CN117635585A (en) | 2023-12-06 | 2023-12-06 | Texture surface defect detection method based on teacher-student network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117635585A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118154607A (en) * | 2024-05-11 | 2024-06-07 | 湖南大学 | Lightweight defect detection method based on mixed multiscale knowledge distillation |
CN118468230A (en) * | 2024-07-10 | 2024-08-09 | 苏州元瞰科技有限公司 | Glass defect detection algorithm based on multi-mode teacher and student framework |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117635585A (en) | Texture surface defect detection method based on teacher-student network | |
CN108230359A (en) | Object detection method and device, training method, electronic equipment, program and medium | |
CN107169417B (en) | RGBD image collaborative saliency detection method based on multi-core enhancement and saliency fusion | |
CN111626993A (en) | Image automatic detection counting method and system based on embedded FEFnet network | |
CN114565594A (en) | Image anomaly detection method based on soft mask contrast loss | |
WO2024021461A1 (en) | Defect detection method and apparatus, device, and storage medium | |
CN118230059B (en) | Abnormal state detection method for long-distance pipeline interior through correlation analysis of different spectrum data | |
CN114419323A (en) | Cross-modal learning and domain self-adaptive RGBD image semantic segmentation method | |
CN117451716A (en) | Industrial product surface defect detection method | |
CN117576079A (en) | Industrial product surface abnormality detection method, device and system | |
CN114820541A (en) | Defect detection method based on reconstructed network | |
Akhyar et al. | A beneficial dual transformation approach for deep learning networks used in steel surface defect detection | |
CN114463614A (en) | Significance target detection method using hierarchical significance modeling of generative parameters | |
CN111882545B (en) | Fabric defect detection method based on bidirectional information transmission and feature fusion | |
CN117876299A (en) | Multi-mode industrial anomaly detection method and system based on teacher-student network architecture | |
CN117611963A (en) | Small target detection method and system based on multi-scale extended residual error network | |
CN117078608A (en) | Double-mask guide-based high-reflection leather surface defect detection method | |
CN117372413A (en) | Wafer defect detection method based on generation countermeasure network | |
CN116977747A (en) | Small sample hyperspectral classification method based on multipath multi-scale feature twin network | |
Zhang et al. | Digital instruments recognition based on PCA-BP neural network | |
CN115035097A (en) | Cross-scene strip steel surface defect detection method based on domain adaptation | |
CN115100451A (en) | Data expansion method for monitoring oil leakage of hydraulic pump | |
CN112801955B (en) | Plankton detection method under unbalanced population distribution condition | |
CN114445649A (en) | Method for detecting RGB-D single image shadow by multi-scale super-pixel fusion | |
CN114842309B (en) | Optimized recognition method for familiar targets in familiar scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||