WO2023070911A1 - A self-attention-based method for detecting defect regions in color-textured fabrics - Google Patents

A self-attention-based method for detecting defect regions in color-textured fabrics Download PDF

Info

Publication number
WO2023070911A1
WO2023070911A1 (PCT/CN2021/139961)
Authority
WO
WIPO (PCT)
Prior art keywords
layer
image
swin
fabric
formula
Prior art date
Application number
PCT/CN2021/139961
Other languages
English (en)
French (fr)
Inventor
张宏伟
熊文博
张伟伟
张蕾
景军锋
Original Assignee
西安工程大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西安工程大学 filed Critical 西安工程大学
Publication of WO2023070911A1 publication Critical patent/WO2023070911A1/zh


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30124Fabrics; Textile; Paper

Definitions

  • the invention belongs to the technical field of defect detection methods and relates to a self-attention-based method for detecting defect regions in color-textured fabrics.
  • Color-textured fabrics have attractive and diverse patterns, and their sales have grown rapidly in recent years. They are used not only in garment manufacturing but also in industrial products. During production, however, unavoidable factors cause defects to appear on the fabric surface. At present, most enterprises rely on manual visual inspection, which is affected by eye fatigue and therefore suffers from low efficiency and a high missed-detection rate. An accurate and fast automatic defect detection method for color-textured fabrics is therefore needed.
  • at present, machine-vision-based fabric defect detection has received extensive attention from researchers. Depending on the detection approach, existing methods can be divided into traditional methods and deep learning methods.
  • traditional detection methods can be further divided into spatial-domain, frequency-domain, model-based, and learning-based methods.
  • traditional methods are limited to grayscale fabrics with simple textures and cannot achieve good detection results on complex patterns.
  • owing to the powerful feature extraction and feature fusion capabilities of deep convolutional networks, deep learning methods have gradually become a research hotspot. Supervised methods achieve good detection results in specific fabric scenarios, but they rely on a large number of defect samples with manually annotated defect regions.
  • unsupervised deep learning methods for color-textured fabric defect detection mainly use the difference between the input image under test and its corresponding reconstructed image to detect and locate defects. Specifically, the model is required to remove defective regions while preserving normal regions.
  • in practice, however, deepening a convolutional neural network often leads to overfitting, which in turn causes defect regions to be missed or over-detected, so such methods cannot effectively solve the problem of detecting defect regions in color-textured fabrics.
  • the purpose of the present invention is to provide a self-attention-based method for detecting defect regions in color-textured fabrics, which solves the prior-art problem that deepening the convolutional neural network often causes the model to overfit, leading to missed or over-detected defect regions, so that defect regions in color-textured fabrics cannot be detected effectively.
  • the technical scheme adopted by the present invention is a self-attention-based method for detecting defect regions in color-textured fabrics, implemented according to the following steps:
  • Step 1: establish a color-textured fabric dataset containing defect-free color-texture images, and superimpose noise on the defect-free images in the dataset;
  • Step 2: build a Transformer-based Swin-Unet model;
  • Step 3: input the noise-superimposed defect-free color-textured fabric images from step 1 into the Transformer-based Swin-Unet model built in step 2 for training, obtaining a trained Transformer-based Swin-Unet model;
  • Step 4: use the trained Transformer-based Swin-Unet model from step 3 to reconstruct the color-textured fabric image to be tested, output the corresponding reconstructed image, and then judge and locate the defect region from the reconstructed image.
  • the present invention is also characterized in that,
  • Step 1 is specifically:
  • Step 1.1 establish the color texture fabric dataset
  • the dataset includes a training set of defect-free color-textured fabric images and a test set of defective color-textured fabric images; all images in the dataset are resized to a resolution of 512×512×3, and the image format is .jpg;
  • Step 1.2 superimpose noise on the defect-free images of the training set from step 1.1, as shown in formula (1): X̃ = X + N(0, 0.1)
  • X is the defect-free color-textured fabric image
  • N(0, 0.1) represents Gaussian noise drawn from a normal distribution with mean 0 and variance 0.1
  • X̃ is the defect-free color-textured fabric image after the noise is superimposed.
  • the Transformer-based Swin-Unet model in step 2 is specifically:
  • the Transformer-based Swin-Unet model is a Transformer-based U-shaped symmetric encoder-decoder structure, composed of an encoder, a bottleneck layer, and a decoder connected in sequence.
  • the input layer of the encoder receives the noise-superimposed defect-free color-textured fabric image
  • the output layer of the decoder produces the reconstructed color-textured fabric image
  • the encoder and decoder are connected to each other through three skip connections.
  • the encoder consists of an input layer, a Patch Embedding layer, 3 Swin Transformer Block layers and 3 Patch Merging layers.
  • the Swin Transformer Block layers and the Patch Merging layers are connected alternately, and the Patch Embedding layer is connected to the first Swin Transformer Block layer through a convolution with kernel size 4, stride 4, and padding 0.
  • each Swin Transformer Block layer is connected, through its self-attention layer, to the Patch Merging layer that follows it.
  • the self-attention layer is included in the Swin Transformer Block layer.
  • each Patch Merging layer is connected to the Swin Transformer Block layer that follows it through a fully connected layer and a channel normalization operation, both of which are included in the Patch Merging layer, and the last Patch Merging layer of the encoder is connected to the bottleneck layer;
  • the bottleneck layer is composed of two Swin Transformer Block layers connected in sequence, and the output layer of the encoder is connected to the first Swin Transformer Block layer of the bottleneck layer through a channel normalization operation, wherein the channel normalization operation is included in the output layer of the encoder;
  • the second Swin Transformer Block layer of the bottleneck layer is connected to the input layer of the decoder through a fully connected layer, wherein the fully connected layer is included in the second Swin Transformer Block layer;
  • the decoder is composed of 3 Patch Expanding layers, 3 Swin Transformer Block layers, Patch Projection layer, and output layer connections.
  • the first Patch Expanding layer of the decoder is connected to the second Swin Transformer Block layer of the bottleneck layer.
  • the Patch Expanding layer and the Swin Transformer Block layer are connected alternately.
  • the Patch Expanding layer is connected to the Swin Transformer Block layer by using the fully connected layer and the channel normalization operation.
  • the Swin Transformer Block layer is connected to the Patch Projection layer by using the self-attention layer.
  • the Patch Projection layer is connected to the output layer through a convolution with kernel size 1, stride 1, and padding 0;
  • the three Swin Transformer Block layers of the encoder are connected to the three Swin Transformer Block layers of the decoder in one-to-one correspondence.
  • the Swin Transformer Block layer is composed of a LayerNorm layer, a window multi-head self-attention layer, a shifted-window multi-head self-attention layer, and an MLP layer.
  • the LayerNorm layer is a channel normalization operation; the window multi-head self-attention layer and the shifted-window multi-head self-attention layer each consist of 2 fully connected layers, with the Softmax activation function added after the fully connected layers.
  • the shifted-window multi-head self-attention layer additionally applies shift and slice operations after the Softmax activation.
  • the MLP layer consists of 2 fully connected layers, with the GELU activation function added between the two fully connected layers; these components are connected as follows:
  • the input feature z^(l-1) first passes through the LayerNorm layer and the window multi-head self-attention layer, and a residual addition yields ẑ^l; ẑ^l then passes through a LayerNorm layer, the MLP layer, and a residual addition to give z^l; z^l then passes through an LN layer, the shifted-window multi-head self-attention layer, and a residual addition to give ẑ^(l+1); finally, a LayerNorm layer, the MLP layer, and a residual addition produce the output feature z^(l+1).
  • the process is as shown in formula (2):
  • LN() represents the output processed by the LayerNorm layer
  • MLP() represents the output processed by the MLP layer
  • W-MSA() represents the output processed by the window multi-head self-attention layer
  • SW-MSA() represents the output processed by the shifted-window multi-head self-attention layer
  • the LayerNorm layer is the channel normalization operation.
  • the window multi-head self-attention layer and the shifted window multi-head self-attention layer calculate the self-attention Attention (Q, K, V) in each window, as shown in formula (3):
  • Q, K, and V represent the query matrix, key matrix, and value matrix, respectively, d represents the dimension of the matrix, B represents the bias matrix, and SoftMax is the activation function.
  • in the first Swin Transformer Block layer of the encoder, the numbers of neurons in the two fully connected layers of the MLP are 48 and 192, respectively.
  • in the second Swin Transformer Block layer of the encoder they are 96 and 384, respectively.
  • in the third Swin Transformer Block layer of the encoder they are 192 and 768, respectively, and in the Swin Transformer Block layers of the bottleneck layer they are 384 and 1536, respectively.
  • the number of MLP layer neurons in each Swin Transformer Block layer of the decoder is equal to the corresponding number of neurons in the MLP layer of the encoder.
  • Step 3 is specifically:
  • Step 3.1 input the non-defective image of color texture fabric with superimposed noise into the Transformer-based Swin-Unet model constructed in step 2 to obtain a reconstructed image;
  • Step 3.2 calculate the mean square error loss for the reconstructed image obtained in step 3.1 and its corresponding color texture fabric image without superimposed noise, such as formula (4):
  • X(i) is the color textured fabric image corresponding to the reconstructed image without superimposed noise
  • n is the number of color textured fabric images without superimposed noise
  • L MSE is the loss function
  • step 3.3 take minimization of L_MSE as the optimization objective, use the AdamW optimizer to minimize the loss function with a learning rate of 0.0001, set the maximum number of iterations, and train on the images to obtain the trained Transformer-based Swin-Unet model.
  • Step 4 is specifically:
  • Step 4.1 input the color fabric image to be tested to the Transformer-based Swin-Unet model trained in step 3, and obtain the corresponding reconstructed image;
  • step 4.2 grayscale the input color fabric image to be tested and its corresponding reconstructed image, as shown in formula (5):
  • X Gray represents the image after grayscale
  • X r , X g , X b are the pixel values of RGB three different color channels corresponding to the color fabric image to be tested or the corresponding reconstructed image respectively;
  • Step 4.3 calculate the absolute value of the difference between the grayscale value of the corresponding pixel between the grayscaled fabric image to be tested and the corresponding reconstructed image in step 4.2, as in formula (6):
  • X_Gray is the grayscaled fabric image to be tested, X̂_Gray is the corresponding grayscaled reconstructed image, and X_Residual is the residual image;
  • Step 4.4 calculate the structural similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image in step 4.2, as shown in formula (7):
  • μ_X and μ_X̂ are the mean gray values (i.e., the average gray pixel values) of the fabric image to be tested and of the corresponding reconstructed image, σ_X and σ_X̂ are their gray-level standard deviations, σ_XX̂ is the covariance between the two images, and C_1 and C_2 are constants that prevent the denominator from being 0,
  • the sliding window is moved on the image plane with a given step size, and the similarity of the overlapping regions is averaged to obtain the structural similarity image x ssim ;
  • Step 4.5 calculate the gradient magnitude similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image in step 4.2, as shown in formula (8):
  • i is the position of the pixel value in the image
  • X GMS is the similarity of the gradient magnitude
  • c is a constant that prevents the denominator from being 0, and m_X and m_X̂ are the gradient-magnitude images of the grayscaled fabric image to be tested and of the grayscaled reconstructed image, respectively; the gradient-magnitude image is defined in formula (9):
  • X_Gray is the grayscaled fabric image to be tested
  • X̂_Gray is the grayscaled reconstructed image corresponding to the fabric image to be tested
  • h_x and h_y are the Prewitt filter kernels in the horizontal and vertical directions, respectively;
  • step 4.6 normalize the gradient magnitude similarity map obtained in step 4.5, as shown in formula (10):
  • step 4.7 perform point product fusion on the residual image obtained in step 4.3, the structural similarity image obtained in step 4.4, and the normalized gradient magnitude similarity image obtained in step 4.6, as shown in formula (11):
  • X Residual is the residual image
  • X SSIM is the structural similarity image
  • X Fusion is the fusion image after multiplication fusion
  • step 4.8 apply Gaussian filtering to the fused image obtained in step 4.7, using a Gaussian convolution kernel in a sliding-window operation over the image to obtain the filtered image, as shown in formula (12):
  • X Fusion is the fused image
  • X Fusion&Gaussian is the fused image after Gaussian filtering
  • * is the sliding window convolution operation
  • G(x,y) is the Gaussian kernel function, as shown in formula (13):
  • (x, y) is the pixel coordinates of the fused image
  • ⁇ x and ⁇ y are the pixel standard deviations of the fused image in the direction of the x-axis and y-axis, respectively;
  • Step 4.9 the Gaussian-filtered fused image obtained in step 4.8 is used to determine the threshold using an adaptive threshold method, and binarization is performed to obtain a binary image, such as formula (14):
  • p is the pixel value of the binarized image
  • T is the image adaptive threshold
  • ⁇ and ⁇ are the mean and variance of the fused image after Gaussian filtering, respectively
  • ε is the coefficient of the variance; if the pixel value at a point in the image is lower than the image adaptive threshold, the pixel value is set to logic 0, otherwise it is set to logic 1;
  • step 4.10 perform a closing operation on the binarized image obtained in step 4.9 to obtain the final detection result image, wherein the closing operation is as in formula (15):
  • X binary is the binarized image obtained in step 4.9
  • E is a 3×3 structuring element of the closing operation, ⊕ denotes the image dilation operation, ⊖ denotes the image erosion operation, and
  • X Closing is the final detection result image
  • step 4.11 use the final detection result image obtained in step 4.10 to detect whether a defect exists and to locate the defect region; if the final detection result image contains a white region with pixel value 255, it can be determined that the color-textured fabric image under test contains a defect, and the defect region is located where the white region appears.
  • in step 4.5 the size of the Prewitt filter is 3×3, with separate filter coefficients for the horizontal and vertical directions.
  • the dot-product fusion in step 4.7 is an element-wise multiplication of the three matrices,
  • the size of the Gaussian convolution kernel in step 4.8 is 3×3, and
  • the parameter ε of the adaptive threshold method in step 4.9 is set to 3.5 based on experience.
  • in the training stage, without requiring defect samples or manual annotation, the model constructed by the present invention can effectively reconstruct color-textured fabrics; by computing the difference between the color fabric image to be tested and the corresponding reconstructed image, and combining the proposed post-processing of dot-product fusion, adaptive thresholding, and the closing operation, missed and over-detected defect regions are reduced.
  • the detection accuracy and speed of this method can meet the technical requirements of production inspection of color-textured fabrics, and provide an automatic defect detection scheme that is easy to put into engineering practice in the actual garment industry.
  • Fig. 1 shows some defect-free samples from the color-textured fabric training set in the self-attention-based method of the present invention for detecting defect regions in color-textured fabrics;
  • Fig. 2 shows some defective samples from the color-textured fabric test set;
  • Fig. 3 is a structural diagram of the Swin-Unet model;
  • Fig. 4 is a structural diagram of the Swin Transformer Block layer;
  • Fig. 5 is a schematic flow chart of step 3;
  • Fig. 6 is a schematic flow chart of step 4;
  • Fig. 7 compares the detection results of the Swin-Unet model used in the experiments with those of the UDCAE model.
  • the present invention is a self-attention-based method for detecting defect regions in color-textured fabrics, implemented according to the following steps:
  • Step 1 Establish a color texture fabric dataset including color texture defect-free images, and superimpose noise on the color texture defect-free images in the color texture fabric dataset; specifically:
  • Step 1.1 establish the color texture fabric data set
  • the color texture fabric data includes the color texture fabric non-defective image training set and the color texture fabric defect image test set as shown in Figure 1 and Figure 2
  • Figure 1 is the color texture fabric training set
  • Figure 2 is a partial defect image in the color texture fabric test set
  • all the images in the color-textured fabric dataset are resized to a resolution of 512×512×3, and the image format is .jpg
  • in total, defect-free and defective images of four different color-textured fabrics were prepared, namely SP3, SP5, SP24 and CL1;
  • Step 1.2 superimpose noise on the color texture fabric defect-free images in the training set of color texture fabric defect-free images in step 1.1, as shown in formula (1):
  • X is the defect-free image of the color texture fabric
  • N(0, 0.1) represents Gaussian noise drawn from a normal distribution with mean 0 and variance 0.1
  • X̃ is the defect-free color-textured fabric image after the noise is superimposed.
  • Step 2 build a Transformer-based Swin-Unet model, specifically:
  • the Transformer-based Swin-Unet model is a Transformer-based U-shaped symmetric encoder-decoder structure, which is composed of an encoder, a bottleneck layer, and a decoder connected in sequence; the input layer of the encoder receives the noise-superimposed defect-free color-textured fabric image
  • the output layer of the decoder is the reconstructed color textured fabric image
  • the encoder and decoder are connected to each other through 3 skip connections.
  • the encoder consists of an input layer, a Patch Embedding layer, 3 Swin Transformer Block layers and 3 Patch Merging layers.
  • the Swin Transformer Block layers and the Patch Merging layers are connected alternately, and the Patch Embedding layer is connected to the first Swin Transformer Block layer through a convolution with kernel size 4, stride 4, and padding 0.
  • the Swin Transformer Block layer uses the self-attention layer to connect to the Patch Merging layer after the Swin Transformer Block layer.
  • the self-attention layer is included in the Swin Transformer Block layer.
  • the self-attention layer can be composed of the window multi-head self-attention layer (W-MSA) and the shifted window multi-head self-attention layer (SW-MSA) in the Swin Transformer Block layer.
  • the Patch Merging layer is connected to the Swin Transformer Block layer that follows it through a fully connected layer and a channel normalization operation, both of which are included in the Patch Merging layer; the Patch Merging layer itself is composed of several fully connected layers followed by a channel normalization layer, and the last Patch Merging layer of the encoder is connected to the bottleneck layer;
  • the bottleneck layer is composed of two Swin Transformer Block layers connected in sequence.
  • the output layer of the encoder is connected to the first Swin Transformer Block layer of the bottleneck layer through a channel normalization operation, where the channel normalization operation is included in the output layer of the encoder; the second Swin Transformer Block layer of the bottleneck layer is connected to the input layer of the decoder through a fully connected layer, wherein the fully connected layer is included in the second Swin Transformer Block layer;
  • the decoder is composed of 3 Patch Expanding layers, 3 Swin Transformer Block layers, Patch Projection layer, and output layer connections.
  • the first Patch Expanding layer of the decoder is connected to the second Swin Transformer Block layer of the bottleneck layer.
  • the Patch Expanding layer and the Swin Transformer Block layer are connected alternately.
  • the Patch Expanding layer is connected to the Swin Transformer Block layer by using the fully connected layer and the channel normalization operation.
  • the Swin Transformer Block layer is connected to the Patch Projection layer by using the self-attention layer.
  • the Patch Projection layer is connected to the output layer through a convolution with kernel size 1, stride 1, and padding 0;
  • the three Swin Transformer Block layers of the encoder are connected to the three Swin Transformer Block layers of the decoder in one-to-one correspondence.
  • the Swin Transformer Block layer is the basic unit of the model. As shown in Fig. 4, the Swin Transformer Block layer consists of a LayerNorm (LN) layer, a window multi-head self-attention layer (W-MSA), a shifted-window multi-head self-attention layer (SW-MSA), and an MLP layer. The LayerNorm layer is a channel normalization operation; the W-MSA and SW-MSA layers each consist of two fully connected layers with the Softmax activation function added after them, and the SW-MSA layer additionally applies shift and slice operations after the Softmax activation.
  • the MLP layer consists of 2 fully connected layers, and the activation function GELU is added between the 2 fully connected layers:
  • the input feature z^(l-1) first passes through the LayerNorm layer and the window multi-head self-attention layer, and a residual addition yields ẑ^l; ẑ^l then passes through a LayerNorm layer, the MLP layer, and a residual addition to give z^l; z^l then passes through an LN layer, the shifted-window multi-head self-attention layer, and a residual addition to give ẑ^(l+1); finally, a LayerNorm layer, the MLP layer, and a residual addition produce the output feature z^(l+1).
  • the process is as shown in formula (2):
  • LN() represents the output processed by the LayerNorm layer
  • MLP() represents the output processed by the MLP layer
  • W-MSA() represents the output processed by the window multi-head self-attention layer
  • SW-MSA() represents the output processed by the shifted-window multi-head self-attention layer
  • the LayerNorm layer is the channel normalization operation.
  • the window multi-head self-attention layer and the shifted window multi-head self-attention layer calculate the self-attention Attention (Q, K, V) in each window, as shown in formula (3):
  • Q, K, and V represent the query matrix, key matrix, and value matrix, respectively, d represents the dimension of the matrix, B represents the bias matrix, and SoftMax is the activation function.
  • in the first Swin Transformer Block layer of the encoder, the numbers of neurons in the two fully connected layers of the MLP are 48 and 192, respectively.
  • in the second Swin Transformer Block layer of the encoder they are 96 and 384, respectively.
  • in the third Swin Transformer Block layer of the encoder they are 192 and 768, respectively, and in the Swin Transformer Block layers of the bottleneck layer they are 384 and 1536, respectively.
  • the number of MLP layer neurons in each Swin Transformer Block layer of the decoder is equal to the corresponding number of neurons in the MLP layer of the encoder.
  • Step 3 as shown in Figure 5, input the non-defective image of color texture fabric with superimposed noise in step 1 into the Swin-Unet model based on Transformer constructed in step 2 for training, and obtain the trained Swin-Unet model based on Transformer; Specifically:
  • Step 3.1 input the non-defective image of color texture fabric with superimposed noise into the Transformer-based Swin-Unet model constructed in step 2 to obtain a reconstructed image;
  • Step 3.2 calculate the mean square error loss for the reconstructed image obtained in step 3.1 and its corresponding color texture fabric image without superimposed noise, such as formula (4):
  • X(i) is the color textured fabric image corresponding to the reconstructed image without superimposed noise
  • n is the number of color textured fabric images without superimposed noise
  • L MSE is the loss function
  • step 3.3 take minimization of L_MSE as the optimization objective, use the AdamW optimizer to minimize the loss function with a learning rate of 0.0001, set the maximum number of iterations, and train on the images to obtain the trained Transformer-based Swin-Unet model.
  • Step 4 use the Transformer-based Swin-Unet model trained in step 3 to reconstruct the color texture fabric image to be tested, output the corresponding reconstructed image, and then judge and locate the defect area based on the reconstructed image, Specifically:
  • Step 4.1 input the color fabric image to be tested to the Swin-Unet model based on Transformer trained in step 3, and obtain the corresponding reconstructed image;
  • step 4.2 grayscale the input color fabric image to be tested and its corresponding reconstructed image, as shown in formula (5):
  • X Gray represents the image after grayscale
  • X r , X g , X b are the pixel values of RGB three different color channels corresponding to the color fabric image to be tested or the corresponding reconstructed image respectively;
  • Step 4.3 calculate the absolute value of the difference between the grayscale value of the corresponding pixel between the grayscaled fabric image to be tested and the corresponding reconstructed image in step 4.2, as in formula (6):
  • X_Gray is the grayscaled fabric image to be tested, X̂_Gray is the corresponding grayscaled reconstructed image, and X_Residual is the residual image;
  • Step 4.4 calculate the structural similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image in step 4.2, as shown in formula (7):
  • μ_X and μ_X̂ are the mean gray values (i.e., the average gray pixel values) of the fabric image to be tested and of the corresponding reconstructed image, σ_X and σ_X̂ are their gray-level standard deviations, σ_XX̂ is the covariance between the two images, and C_1 and C_2 are constants that prevent the denominator from being 0,
  • the sliding window is moved on the image plane with a given step size, and the similarity of the overlapping regions is averaged to obtain the structural similarity image x ssim ;
  • Step 4.5 calculate the gradient magnitude similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image in step 4.2, as shown in formula (8):
  • i is the position of the pixel value in the image
  • X GMS is the similarity of the gradient magnitude
  • c is a constant that prevents the denominator from being 0, and m_X and m_X̂ are the gradient-magnitude images of the grayscaled fabric image to be tested and of the grayscaled reconstructed image, respectively; the gradient-magnitude image is defined in formula (9):
  • X_Gray is the grayscaled fabric image to be tested
  • X̂_Gray is the grayscaled reconstructed image corresponding to the fabric image to be tested
  • h_x and h_y are the Prewitt filter kernels in the horizontal and vertical directions, respectively;
  • step 4.6 normalize the gradient magnitude similarity map obtained in step 4.5, as shown in formula (10):
  • step 4.7 perform dot-product fusion, that is, element-wise multiplication, of the residual image obtained in step 4.3, the structural similarity image obtained in step 4.4, and the normalized gradient magnitude similarity image obtained in step 4.6, as shown in formula (11):
  • X Residual is the residual image
  • X SSIM is the structural similarity image
  • X Fusion is the fusion image after multiplication fusion
  • step 4.8 apply Gaussian filtering to the fused image obtained in step 4.7, using a Gaussian convolution kernel in a sliding-window operation over the image to obtain the filtered image, as shown in formula (12):
  • X Fusion is the fused image
  • X Fusion&Gaussian is the fused image after Gaussian filtering
  • * is the sliding window convolution operation
  • G(x, y) is the Gaussian kernel function
  • the size of the Gaussian convolution kernel is 3×3, as shown in formula (13):
  • (x, y) is the pixel coordinates of the fused image
  • ⁇ x and ⁇ y are the pixel standard deviations of the fused image in the direction of the x-axis and y-axis, respectively;
  • Step 4.9 the Gaussian-filtered fused image obtained in step 4.8 is used to determine the threshold using an adaptive threshold method, and binarization is performed to obtain a binary image, such as formula (14):
  • p is the pixel value of the binarized image
  • T is the image adaptive threshold
  • ⁇ and ⁇ are the mean and variance of the fused image after Gaussian filtering, respectively
  • step 4.10 perform a closing operation on the binarized image obtained in step 4.9 to obtain the final detection result image, wherein the closing operation is as in formula (15):
  • X binary is the binarized image obtained in step 4.9
  • E is a 3×3 structuring element of the closing operation, ⊕ denotes the image dilation operation, ⊖ denotes the image erosion operation, and
  • X Closing is the final detection result image
  • step 4.11 use the final detection result image obtained in step 4.10 to detect whether a defect exists and to locate the defect region; if the final detection result image contains a white region with pixel value 255, it can be determined that the color-textured fabric image under test contains a defect, and the defect region is located where the white region appears.
  • the detection method of the present invention for defect regions of color-textured fabrics is described below with a specific example:
  • the hardware environment configuration is Intel(R) Core(TM) i7-6850K CPU; the graphics card is GeForce RTX 3090 (24G); the memory is 128G.
  • the software configuration is: the operating system is Ubuntu 18.04.5LTS; the deep learning framework is PyTorch1.7.1; the environment is based on Anaconda3 and Python3.6.2.
  • the comprehensive evaluation index (F1-measure, F1) and the average intersection-over-union ratio (IoU) in the pixel-level evaluation index are used as the evaluation index.
  • F1-measure can evaluate the detection performance more comprehensively.
  • IoU indicates the closeness between the detected defect area and the real defect area.
  • the evaluation index is defined as formula (16-17):
  • TP represents the number of pixels successfully detected in the defective region
  • FP represents the number of pixels in the non-defective region that were misdetected as defective regions
  • FN represents the number of pixels in the defective region that were not detected.
  • Experimental procedure: first, a color-textured fabric dataset is established, including a training set of defect-free images and a test set of defective images; secondly, a Transformer-based Swin-Unet model is constructed; then, the model is trained so that it acquires the ability to reconstruct normal samples and repair defect regions; finally, defect detection is performed on the color-textured fabric image to be tested by calculating the difference between it and the corresponding reconstructed image and combining the proposed post-processing method, thereby detecting and localizing the defect regions.
  • a self-attention-based detection method for color textured fabric defect areas proposed by the present invention is essentially a Transformer-based Swin-Unet model, without the need for defect samples and manual marking.
  • the constructed unsupervised model can effectively reconstruct normal samples and repair defect regions, so that defect areas can be detected and localized quickly and accurately.
  • This method does not require a large number of manually labeled defect samples, and can effectively avoid practical problems such as the scarcity of defect samples, the imbalance of defect types, and the high cost of artificially constructed features.
  • the experimental results show that the detection accuracy and speed of this method can meet the technical requirements of the production detection of colored textured fabrics, and provide an automatic defect detection scheme that is easy for engineering practice in the actual garment industry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

A self-attention-based method for detecting defect regions in color-textured fabrics, comprising: establishing a color-textured fabric dataset containing defect-free color-texture images and superimposing noise on those defect-free images; building a Transformer-based Swin-Unet model and training it to obtain a trained model; using the trained model to reconstruct the color-textured fabric image to be tested, outputting the corresponding reconstructed image, and then judging and locating the defect region from the reconstructed image. The method effectively solves the problem of detecting defect regions in color-textured fabrics.

Description

A self-attention-based method for detecting defect regions in color-textured fabrics
Technical Field
The present invention belongs to the technical field of defect detection methods and relates to a self-attention-based method for detecting defect regions in color-textured fabrics.
Background Art
Color-textured fabrics have attractive and diverse patterns, and their sales have grown rapidly in recent years; they are used not only in garment manufacturing but also in industrial products. During production, however, unavoidable factors cause defects to appear on the fabric surface. At present, most enterprises rely on manual visual inspection, which is affected by eye fatigue and therefore suffers from low efficiency and a high missed-detection rate. An accurate and fast automatic defect detection method for color-textured fabrics is therefore needed.
At present, machine-vision-based fabric defect detection has received extensive attention from researchers. Depending on the detection approach, existing methods can be divided into traditional methods and deep learning methods. Traditional methods can be further divided into spatial-domain, frequency-domain, model-based, and learning-based methods; they are limited to grayscale fabrics with simple textures and cannot achieve good results on complex patterns. Owing to the powerful feature extraction and feature fusion capabilities of deep convolutional networks, deep learning methods have gradually become a research hotspot. Supervised methods achieve good detection results in specific fabric scenarios, but they rely on a large number of defect samples with manually annotated defect regions. Because defect samples of small-batch color-textured fabrics are scarce and the defect types are imbalanced, it is difficult to build a color-textured fabric database with complete defect categories, so supervised deep learning methods cannot solve the defect detection problem for color-textured fabrics. Unsupervised deep learning methods have attracted the attention of some researchers because they require neither defect samples nor manual annotation. Unsupervised methods for color-textured fabric defect detection mainly use the difference between the input image under test and its corresponding reconstructed image to detect and locate defects; this requires the model to remove defective regions while preserving normal regions. In practice, however, deepening a convolutional neural network often causes the model to overfit, leading to missed or over-detected defect regions, so such methods cannot effectively solve the problem of detecting defect regions in color-textured fabrics.
Summary of the Invention
The purpose of the present invention is to provide a self-attention-based method for detecting defect regions in color-textured fabrics, which solves the prior-art problem that deepening a convolutional neural network often causes the model to overfit, leading to missed or over-detected defect regions, so that defect regions in color-textured fabrics cannot be detected effectively.
The technical scheme adopted by the present invention is a self-attention-based method for detecting defect regions in color-textured fabrics, implemented according to the following steps:
Step 1: establish a color-textured fabric dataset containing defect-free color-texture images, and superimpose noise on the defect-free images in the dataset;
Step 2: build a Transformer-based Swin-Unet model;
Step 3: input the noise-superimposed defect-free color-textured fabric images from step 1 into the Transformer-based Swin-Unet model built in step 2 for training, obtaining a trained Transformer-based Swin-Unet model;
Step 4: use the trained Transformer-based Swin-Unet model from step 3 to reconstruct the color-textured fabric image to be tested, output the corresponding reconstructed image, and then judge and locate the defect region from the reconstructed image.
The present invention is further characterized in that:
Step 1 is specifically:
Step 1.1: establish the color-textured fabric dataset; the dataset includes a training set of defect-free color-textured fabric images and a test set of defective color-textured fabric images, all images in the dataset are resized to a resolution of 512×512×3, and the image format is .jpg;
Step 1.2: superimpose noise on the defect-free images of the training set from step 1.1, as shown in formula (1):
X̃ = X + N(0, 0.1)        (1)
where X is a defect-free color-textured fabric image, N(0, 0.1) denotes Gaussian noise drawn from a normal distribution with mean 0 and variance 0.1, and X̃ is the defect-free color-textured fabric image after the noise is superimposed.
The Transformer-based Swin-Unet model in step 2 is specifically:
The Transformer-based Swin-Unet model is a Transformer-based U-shaped symmetric encoder-decoder structure composed of an encoder, a bottleneck layer, and a decoder connected in sequence. The input layer of the encoder receives the noise-superimposed defect-free color-textured fabric image, the output layer of the decoder produces the reconstructed color-textured fabric image, and the encoder and decoder are connected to each other through three skip connections.
The encoder is composed of an input layer, a Patch Embedding layer, three Swin Transformer Block layers, and three Patch Merging layers, where the Swin Transformer Block layers and the Patch Merging layers are connected alternately. The Patch Embedding layer is connected to the first Swin Transformer Block layer through a convolution with kernel size 4, stride 4, and padding 0. Each Swin Transformer Block layer is connected, through its self-attention layer, to the Patch Merging layer that follows it, the self-attention layer being included in the Swin Transformer Block layer. Each Patch Merging layer is connected to the Swin Transformer Block layer that follows it through a fully connected layer and a channel normalization operation, both included in the Patch Merging layer, and the last Patch Merging layer of the encoder is connected to the bottleneck layer;
The bottleneck layer is composed of two Swin Transformer Block layers connected in sequence. The output layer of the encoder is connected to the first Swin Transformer Block layer of the bottleneck through a channel normalization operation included in the output layer of the encoder, and the second Swin Transformer Block layer of the bottleneck is connected to the input layer of the decoder through a fully connected layer included in that second Swin Transformer Block layer;
The decoder is composed of three Patch Expanding layers, three Swin Transformer Block layers, a Patch Projection layer, and an output layer connected in sequence. The first Patch Expanding layer of the decoder is connected to the second Swin Transformer Block layer of the bottleneck; within the decoder, the Patch Expanding layers and the Swin Transformer Block layers are connected alternately, each Patch Expanding layer is connected to the following Swin Transformer Block layer through a fully connected layer and a channel normalization operation, the Swin Transformer Block layer is connected to the Patch Projection layer through its self-attention layer, and the Patch Projection layer is connected to the output layer through a convolution with kernel size 1, stride 1, and padding 0;
The three Swin Transformer Block layers of the encoder are connected to the three Swin Transformer Block layers of the decoder in one-to-one correspondence.
The Swin Transformer Block layer is composed of a LayerNorm layer, a window multi-head self-attention layer, a shifted-window multi-head self-attention layer, and an MLP layer. The LayerNorm layer is a channel normalization operation; the window multi-head self-attention layer and the shifted-window multi-head self-attention layer each consist of two fully connected layers with the Softmax activation function added after them; the shifted-window multi-head self-attention layer additionally applies shift and slice operations after the Softmax activation; and the MLP layer consists of two fully connected layers with the GELU activation function between them. These components are connected as follows:
The input feature z^(l-1) first passes through a LayerNorm layer and the window multi-head self-attention layer, and a residual addition yields ẑ^l; ẑ^l then passes through a LayerNorm layer, the MLP layer, and a residual addition to give z^l; z^l passes through an LN layer, the shifted-window multi-head self-attention layer, and a residual addition to give ẑ^(l+1); finally, a LayerNorm layer, the MLP layer, and a residual addition produce the output feature z^(l+1). The process is given by formula (2):
ẑ^l = W-MSA(LN(z^(l-1))) + z^(l-1)
z^l = MLP(LN(ẑ^l)) + ẑ^l
ẑ^(l+1) = SW-MSA(LN(z^l)) + z^l
z^(l+1) = MLP(LN(ẑ^(l+1))) + ẑ^(l+1)        (2)
where LN() denotes the output of the LayerNorm layer, MLP() denotes the output of the MLP layer, W-MSA() denotes the output of the window multi-head self-attention layer, SW-MSA() denotes the output of the shifted-window multi-head self-attention layer, and the LayerNorm layer is the channel normalization operation.
The window multi-head self-attention layer and the shifted-window multi-head self-attention layer compute the self-attention Attention(Q, K, V) within each window, as shown in formula (3):
Attention(Q, K, V) = SoftMax(Q·K^T / sqrt(d) + B)·V        (3)
where Q, K, and V denote the query, key, and value matrices respectively, d is the dimension of the matrices, B is the bias matrix, and SoftMax is the activation function.
In the first Swin Transformer Block layer of the encoder, the numbers of neurons in the two fully connected layers of the MLP are 48 and 192, respectively; in the second Swin Transformer Block layer of the encoder they are 96 and 384; in the third Swin Transformer Block layer of the encoder they are 192 and 768; in the Swin Transformer Block layers of the bottleneck layer they are 384 and 1536; and in each Swin Transformer Block layer of the decoder the numbers of MLP neurons equal the corresponding numbers in the encoder.
Step 3 is specifically:
Step 3.1: input the noise-superimposed defect-free color-textured fabric images into the Transformer-based Swin-Unet model built in step 2 to obtain reconstructed images;
Step 3.2: compute the mean-square-error loss between the reconstructed images obtained in step 3.1 and the corresponding color-textured fabric images without superimposed noise, as shown in formula (4):
L_MSE = (1/n)·Σ_{i=1}^{n} (X(i) - X̂(i))^2        (4)
where X̂(i) is a reconstructed image, X(i) is the corresponding color-textured fabric image without superimposed noise, n is the number of images without superimposed noise, and L_MSE is the loss function;
Step 3.3: taking minimization of L_MSE as the optimization objective, the AdamW optimizer is used to minimize the loss function with a learning rate of 0.0001, the maximum number of iterations is set, and the images are trained to obtain the trained Transformer-based Swin-Unet model.
Step 4 is specifically:
Step 4.1: input the color fabric image to be tested into the Transformer-based Swin-Unet model trained in step 3 to obtain the corresponding reconstructed image;
Step 4.2: convert the input color fabric image to be tested and its corresponding reconstructed image to grayscale, as shown in formula (5):
X_Gray = 0.2125·X_r + 0.7154·X_g + 0.0721·X_b        (5)
where X_Gray is the grayscaled image and X_r, X_g, X_b are the pixel values of the R, G, and B color channels of the color fabric image to be tested or of the corresponding reconstructed image;
Step 4.3: compute the absolute value of the difference between the gray values of corresponding pixels of the grayscaled fabric image to be tested and the corresponding reconstructed image from step 4.2, as shown in formula (6):
X_Residual = |X_Gray - X̂_Gray|        (6)
where X_Gray is the grayscaled fabric image to be tested, X̂_Gray is the corresponding grayscaled reconstructed image, and X_Residual is the residual image;
Step 4.4: compute the structural similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image from step 4.2, as shown in formula (7):
SSIM(X, X̂) = ((2·μ_X·μ_X̂ + C_1)·(2·σ_XX̂ + C_2)) / ((μ_X^2 + μ_X̂^2 + C_1)·(σ_X^2 + σ_X̂^2 + C_2))        (7)
where μ_X and μ_X̂ are the mean gray values (i.e., the average gray pixel values) of the fabric image to be tested and of the corresponding reconstructed image, σ_X and σ_X̂ are their gray-level standard deviations, σ_XX̂ is the covariance between the two images, and C_1 and C_2 are constants that prevent the denominator from being 0. The structural similarity measures the similarity of the two images in terms of luminance, contrast, and structure; a sliding window is moved over the image plane with a given step, and the similarities of the overlapping regions are averaged to obtain the structural-similarity image X_SSIM;
Step 4.5: compute the gradient-magnitude similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image from step 4.2, as shown in formula (8):
X_GMS(i) = (2·m_X(i)·m_X̂(i) + c) / (m_X(i)^2 + m_X̂(i)^2 + c)        (8)
where i is the position of a pixel in the image, X_GMS is the gradient-magnitude similarity, c is a constant that prevents the denominator from being 0, and m_X and m_X̂ are the gradient-magnitude images of the grayscaled fabric image to be tested and of the grayscaled reconstructed image, respectively; the gradient-magnitude image is defined in formula (9):
m_X = sqrt((X_Gray ⊗ h_x)^2 + (X_Gray ⊗ h_y)^2),  m_X̂ = sqrt((X̂_Gray ⊗ h_x)^2 + (X̂_Gray ⊗ h_y)^2)        (9)
where ⊗ denotes the convolution operation, X_Gray is the grayscaled fabric image to be tested, X̂_Gray is the grayscaled reconstructed image corresponding to the fabric image to be tested, m_X and m_X̂ are the corresponding gradient-magnitude images, and h_x and h_y are the Prewitt filter kernels in the horizontal and vertical directions, respectively;
The per-pixel gradient-magnitude similarities computed with formula (8) form the gradient-magnitude similarity map;
Step 4.6: normalize the gradient-magnitude similarity map obtained in step 4.5, as shown in formula (10):
X̄_GMS = (X_GMS - min(X_GMS)) / (max(X_GMS) - min(X_GMS))        (10)
where min(X_GMS) is the smallest pixel value in the gradient-magnitude similarity map, max(X_GMS) is the largest pixel value in the gradient-magnitude similarity map, and X̄_GMS is the normalized gradient-magnitude similarity map;
Step 4.7: perform dot-product fusion of the residual image obtained in step 4.3, the structural-similarity image obtained in step 4.4, and the normalized gradient-magnitude similarity image obtained in step 4.6, as shown in formula (11):
X_Fusion = X_Residual ⊙ X_SSIM ⊙ X̄_GMS        (11)
where ⊙ denotes element-wise multiplication, X_Residual is the residual image, X_SSIM is the structural-similarity image, X̄_GMS is the normalized gradient-magnitude similarity image, and X_Fusion is the fused image after the multiplicative fusion;
Step 4.8: apply Gaussian filtering to the fused image obtained in step 4.7, using a Gaussian convolution kernel in a sliding-window operation over the image to obtain the filtered image, as shown in formula (12):
X_Fusion&Gaussian = X_Fusion * G(x, y)        (12)
where X_Fusion is the fused image, X_Fusion&Gaussian is the fused image after Gaussian filtering, * is the sliding-window convolution operation, and G(x, y) is the Gaussian kernel function, as shown in formula (13):
G(x, y) = (1 / (2π·σ_x·σ_y))·exp(-(x^2 / (2·σ_x^2) + y^2 / (2·σ_y^2)))        (13)
where (x, y) are the pixel coordinates of the fused image and σ_x and σ_y are the pixel standard deviations of the fused image along the x-axis and y-axis, respectively;
Step 4.9: determine a threshold for the Gaussian-filtered fused image obtained in step 4.8 with an adaptive thresholding method and binarize it to obtain a binary image, as shown in formula (14):
T = μ + ε·σ,  p = 0 if the pixel value is lower than T, otherwise p = 1        (14)
where p is the pixel value of the binarized image, T is the adaptive threshold of the image, μ and σ are the mean and the variance of the Gaussian-filtered fused image respectively, and ε is the coefficient of the variance; if the pixel value at a point in the image is lower than the adaptive threshold, the pixel is set to logic 0, otherwise it is set to logic 1;
Step 4.10: apply a closing operation to the binarized image obtained in step 4.9 to obtain the final detection result image, where the closing operation is given by formula (15):
X_Closing = (X_binary ⊕ E) ⊖ E        (15)
where X_binary is the binarized image obtained in step 4.9, E is the 3×3 structuring element of the closing operation, ⊕ is the image dilation operation, ⊖ is the image erosion operation, and X_Closing is the final detection result image;
Step 4.11: use the final detection result image obtained in step 4.10 to detect whether a defect exists and to locate the defect region; if the final detection result image contains a white region with pixel value 255, it is determined that the color-textured fabric image under test contains a defect, and the defect region is located where the white region appears.
In step 4.5 the Prewitt filter size is 3×3, with separate filter coefficients for the horizontal and vertical directions.
The dot-product fusion in step 4.7 is an element-wise multiplication of the three matrices, the size of the Gaussian convolution kernel in step 4.8 is 3×3, and the parameter ε of the adaptive thresholding method in step 4.9 is set to 3.5 based on experience.
The beneficial effects of the present invention are:
In the training stage, without requiring defect samples or manual annotation, the constructed model can effectively reconstruct color-textured fabrics. By computing the difference between the color fabric image to be tested and the corresponding reconstructed image, and combining the proposed post-processing of dot-product fusion, adaptive thresholding, and the closing operation, missed and over-detected defect regions are reduced. The detection accuracy and speed of the method meet the technical requirements of production inspection of color-textured fabrics and provide an automatic defect detection scheme that is easy to put into engineering practice in the actual garment industry.
Brief Description of the Drawings
Fig. 1 shows some defect-free samples from the color-textured fabric training set used in the self-attention-based method of the present invention for detecting defect regions in color-textured fabrics;
Fig. 2 shows some defective samples from the color-textured fabric test set;
Fig. 3 is a structural diagram of the Swin-Unet model;
Fig. 4 is a structural diagram of the Swin Transformer Block layer;
Fig. 5 is a schematic flow chart of step 3;
Fig. 6 is a schematic flow chart of step 4;
Fig. 7 compares the detection results of the Swin-Unet model used in the experiments with those of the UDCAE model.
Detailed Description of the Embodiments
The present invention is described in detail below with reference to the drawings and specific embodiments.
The self-attention-based method of the present invention for detecting defect regions in color-textured fabrics is implemented according to the following steps:
Step 1: establish a color-textured fabric dataset containing defect-free color-texture images and superimpose noise on the defect-free images in the dataset; specifically:
Step 1.1: establish the color-textured fabric dataset. The dataset includes a training set of defect-free color-textured fabric images and a test set of defective color-textured fabric images, as shown in Fig. 1 and Fig. 2; Fig. 1 shows some defect-free images from the training set and Fig. 2 shows some defective images from the test set. All images in the dataset are resized to a resolution of 512×512×3 and stored as .jpg files. In total, defect-free and defective images of four different color-textured fabrics were prepared, namely SP3, SP5, SP24, and CL1;
Step 1.2: superimpose noise on the defect-free images of the training set from step 1.1, as shown in formula (1):
X̃ = X + N(0, 0.1)        (1)
where X is a defect-free color-textured fabric image, N(0, 0.1) denotes Gaussian noise drawn from a normal distribution with mean 0 and variance 0.1, and X̃ is the defect-free color-textured fabric image after the noise is superimposed.
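For illustration only, a minimal sketch of the noise superposition of formula (1), assuming the defect-free image is a floating-point array scaled to [0, 1]; the function name and the final clipping to the valid range are illustrative choices and not part of the patent text:

```python
import numpy as np

def add_gaussian_noise(image, variance=0.1, seed=None):
    """Superimpose zero-mean Gaussian noise on a defect-free image (formula (1))."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=image.shape)
    # Clip back to the valid intensity range after superimposing the noise.
    return np.clip(image + noise, 0.0, 1.0)
```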
Step 2: build the Transformer-based Swin-Unet model, specifically:
As shown in Fig. 3, the Transformer-based Swin-Unet model is a Transformer-based U-shaped symmetric encoder-decoder structure composed of an encoder, a bottleneck layer, and a decoder connected in sequence. The input layer of the encoder receives the noise-superimposed defect-free color-textured fabric image, the output layer of the decoder produces the reconstructed color-textured fabric image, and the encoder and decoder are connected to each other through three skip connections.
The encoder is composed of an input layer, a Patch Embedding layer, three Swin Transformer Block layers, and three Patch Merging layers, where the Swin Transformer Block layers and the Patch Merging layers are connected alternately. The Patch Embedding layer is connected to the first Swin Transformer Block layer through a convolution with kernel size 4, stride 4, and padding 0. Each Swin Transformer Block layer is connected, through its self-attention layer, to the Patch Merging layer that follows it; the self-attention layer is included in the Swin Transformer Block layer and can be formed jointly by the window multi-head self-attention layer (W-MSA) and the shifted-window multi-head self-attention layer (SW-MSA). Each Patch Merging layer is connected to the Swin Transformer Block layer that follows it through a fully connected layer and a channel normalization operation, both included in the Patch Merging layer; the Patch Merging layer itself is composed of several fully connected layers followed by a channel normalization layer, and the last Patch Merging layer of the encoder is connected to the bottleneck layer;
The bottleneck layer is composed of two Swin Transformer Block layers connected in sequence. The output layer of the encoder is connected to the first Swin Transformer Block layer of the bottleneck through a channel normalization operation included in the output layer of the encoder, and the second Swin Transformer Block layer of the bottleneck is connected to the input layer of the decoder through a fully connected layer included in that second Swin Transformer Block layer;
The decoder is composed of three Patch Expanding layers, three Swin Transformer Block layers, a Patch Projection layer, and an output layer connected in sequence. The first Patch Expanding layer of the decoder is connected to the second Swin Transformer Block layer of the bottleneck; within the decoder, the Patch Expanding layers and the Swin Transformer Block layers are connected alternately, each Patch Expanding layer is connected to the following Swin Transformer Block layer through a fully connected layer and a channel normalization operation, the Swin Transformer Block layer is connected to the Patch Projection layer through its self-attention layer, and the Patch Projection layer is connected to the output layer through a convolution with kernel size 1, stride 1, and padding 0;
The three Swin Transformer Block layers of the encoder are connected to the three Swin Transformer Block layers of the decoder in one-to-one correspondence.
The Swin Transformer Block layer is the basic unit of the model. As shown in Fig. 4, it consists of a LayerNorm (LN) layer, a window multi-head self-attention layer (W-MSA), a shifted-window multi-head self-attention layer (SW-MSA), and an MLP layer. The LayerNorm layer is a channel normalization operation; the W-MSA and SW-MSA layers each consist of two fully connected layers with the Softmax activation function added after them, the SW-MSA layer additionally applies shift and slice operations after the Softmax activation, and the MLP layer consists of two fully connected layers with the GELU activation function between them:
The input feature z^(l-1) first passes through a LayerNorm layer and the window multi-head self-attention layer, and a residual addition yields ẑ^l; ẑ^l then passes through a LayerNorm layer, the MLP layer, and a residual addition to give z^l; z^l passes through an LN layer, the shifted-window multi-head self-attention layer, and a residual addition to give ẑ^(l+1); finally, a LayerNorm layer, the MLP layer, and a residual addition produce the output feature z^(l+1). The process is given by formula (2):
ẑ^l = W-MSA(LN(z^(l-1))) + z^(l-1)
z^l = MLP(LN(ẑ^l)) + ẑ^l
ẑ^(l+1) = SW-MSA(LN(z^l)) + z^l
z^(l+1) = MLP(LN(ẑ^(l+1))) + ẑ^(l+1)        (2)
where LN() denotes the output of the LayerNorm layer, MLP() denotes the output of the MLP layer, W-MSA() denotes the output of the window multi-head self-attention layer, SW-MSA() denotes the output of the shifted-window multi-head self-attention layer, and the LayerNorm layer is the channel normalization operation.
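The residual wiring of formula (2) can be sketched as follows; the W-MSA, SW-MSA, and MLP sub-layers are passed in as stand-in modules (hypothetical arguments), since their internal structure is described elsewhere in this document:

```python
import torch.nn as nn

class SwinBlockPair(nn.Module):
    """Two-step residual wiring of formula (2): a W-MSA block followed by an SW-MSA block."""
    def __init__(self, dim, w_msa, sw_msa, mlp1, mlp2):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.ln3, self.ln4 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.w_msa, self.sw_msa = w_msa, sw_msa
        self.mlp1, self.mlp2 = mlp1, mlp2

    def forward(self, z):                          # z = z^(l-1)
        z_hat = self.w_msa(self.ln1(z)) + z        # ẑ^l
        z = self.mlp1(self.ln2(z_hat)) + z_hat     # z^l
        z_hat = self.sw_msa(self.ln3(z)) + z       # ẑ^(l+1)
        return self.mlp2(self.ln4(z_hat)) + z_hat  # z^(l+1)
```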
The window multi-head self-attention layer and the shifted-window multi-head self-attention layer compute the self-attention Attention(Q, K, V) within each window, as shown in formula (3):
Attention(Q, K, V) = SoftMax(Q·K^T / sqrt(d) + B)·V        (3)
where Q, K, and V denote the query, key, and value matrices respectively, d is the dimension of the matrices, B is the bias matrix, and SoftMax is the activation function.
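A minimal sketch of the per-window attention of formula (3); q, k, and v are assumed to already be split into windows, and the bias tensor plays the role of B:

```python
import torch

def window_attention(q, k, v, bias):
    """Scaled dot-product self-attention inside one window (formula (3)).

    q, k, v: tensors of shape (num_windows, tokens, dim);
    bias: relative-position bias broadcastable to (num_windows, tokens, tokens).
    """
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d ** 0.5 + bias
    return torch.softmax(scores, dim=-1) @ v
```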
In the first Swin Transformer Block layer of the encoder, the numbers of neurons in the two fully connected layers of the MLP are 48 and 192, respectively; in the second Swin Transformer Block layer of the encoder they are 96 and 384; in the third Swin Transformer Block layer of the encoder they are 192 and 768; in the Swin Transformer Block layers of the bottleneck layer they are 384 and 1536; and in each Swin Transformer Block layer of the decoder the numbers of MLP neurons equal the corresponding numbers in the encoder.
Step 3: as shown in Fig. 5, input the noise-superimposed defect-free color-textured fabric images from step 1 into the Transformer-based Swin-Unet model built in step 2 for training, obtaining the trained Transformer-based Swin-Unet model; specifically:
Step 3.1: input the noise-superimposed defect-free color-textured fabric images into the Transformer-based Swin-Unet model built in step 2 to obtain reconstructed images;
Step 3.2: compute the mean-square-error loss between the reconstructed images obtained in step 3.1 and the corresponding color-textured fabric images without superimposed noise, as shown in formula (4):
L_MSE = (1/n)·Σ_{i=1}^{n} (X(i) - X̂(i))^2        (4)
where X̂(i) is a reconstructed image, X(i) is the corresponding color-textured fabric image without superimposed noise, n is the number of images without superimposed noise, and L_MSE is the loss function;
Step 3.3: taking minimization of L_MSE as the optimization objective, the AdamW optimizer is used to minimize the loss function with a learning rate of 0.0001, the maximum number of iterations is set, and the images are trained to obtain the trained Transformer-based Swin-Unet model.
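A denoising-reconstruction training loop corresponding to step 3 might look as follows; the data loader is assumed to yield (noisy, clean) image pairs, and the epoch count is an illustrative placeholder for the maximum number of iterations:

```python
import torch
from torch import nn

def train_swin_unet(model, loader, epochs=100, lr=1e-4, device="cuda"):
    """Minimize the MSE of formula (4) with AdamW at learning rate 0.0001."""
    model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for noisy, clean in loader:
            noisy, clean = noisy.to(device), clean.to(device)
            loss = mse(model(noisy), clean)   # reconstruction vs. clean image
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```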
Step 4: as shown in Fig. 6, use the Transformer-based Swin-Unet model trained in step 3 to reconstruct the color-textured fabric image to be tested, output the corresponding reconstructed image, and then judge and locate the defect region from the reconstructed image; specifically:
Step 4.1: input the color fabric image to be tested into the Transformer-based Swin-Unet model trained in step 3 to obtain the corresponding reconstructed image;
Step 4.2: convert the input color fabric image to be tested and its corresponding reconstructed image to grayscale, as shown in formula (5):
X_Gray = 0.2125·X_r + 0.7154·X_g + 0.0721·X_b        (5)
where X_Gray is the grayscaled image and X_r, X_g, X_b are the pixel values of the R, G, and B color channels of the color fabric image to be tested or of the corresponding reconstructed image;
Step 4.3: compute the absolute value of the difference between the gray values of corresponding pixels of the grayscaled fabric image to be tested and the corresponding reconstructed image from step 4.2, as shown in formula (6):
X_Residual = |X_Gray - X̂_Gray|        (6)
where X_Gray is the grayscaled fabric image to be tested, X̂_Gray is the corresponding grayscaled reconstructed image, and X_Residual is the residual image;
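A sketch of the grayscale conversion of formula (5) and the residual image of formula (6), assuming H×W×3 floating-point RGB arrays:

```python
import numpy as np

def to_gray(rgb):
    """Weighted grayscale conversion of formula (5)."""
    return 0.2125 * rgb[..., 0] + 0.7154 * rgb[..., 1] + 0.0721 * rgb[..., 2]

def residual_map(test_rgb, recon_rgb):
    """Absolute per-pixel grayscale difference of formula (6)."""
    return np.abs(to_gray(test_rgb) - to_gray(recon_rgb))
```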
Step 4.4: compute the structural similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image from step 4.2, as shown in formula (7):
SSIM(X, X̂) = ((2·μ_X·μ_X̂ + C_1)·(2·σ_XX̂ + C_2)) / ((μ_X^2 + μ_X̂^2 + C_1)·(σ_X^2 + σ_X̂^2 + C_2))        (7)
where μ_X and μ_X̂ are the mean gray values (i.e., the average gray pixel values) of the fabric image to be tested and of the corresponding reconstructed image, σ_X and σ_X̂ are their gray-level standard deviations, σ_XX̂ is the covariance between the two images, and C_1 and C_2 are constants that prevent the denominator from being 0. The structural similarity measures the similarity of the two images in terms of luminance, contrast, and structure; a sliding window is moved over the image plane with a given step, and the similarities of the overlapping regions are averaged to obtain the structural-similarity image X_SSIM;
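A sliding-window structural-similarity map in the spirit of formula (7); the window size and the constants C_1 and C_2 below are illustrative values, not taken from the patent:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ssim_map(x, y, win=11, c1=1e-4, c2=9e-4):
    """Local SSIM computed with a sliding window (formula (7))."""
    mu_x, mu_y = uniform_filter(x, win), uniform_filter(y, win)
    var_x = uniform_filter(x * x, win) - mu_x ** 2
    var_y = uniform_filter(y * y, win) - mu_y ** 2
    cov_xy = uniform_filter(x * y, win) - mu_x * mu_y
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```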
Step 4.5: compute the gradient-magnitude similarity between the grayscaled fabric image to be tested and the corresponding reconstructed image from step 4.2, as shown in formula (8):
X_GMS(i) = (2·m_X(i)·m_X̂(i) + c) / (m_X(i)^2 + m_X̂(i)^2 + c)        (8)
where i is the position of a pixel in the image, X_GMS is the gradient-magnitude similarity, c is a constant that prevents the denominator from being 0, and m_X and m_X̂ are the gradient-magnitude images of the grayscaled fabric image to be tested and of the grayscaled reconstructed image, respectively; the gradient-magnitude image is defined in formula (9):
m_X = sqrt((X_Gray ⊗ h_x)^2 + (X_Gray ⊗ h_y)^2),  m_X̂ = sqrt((X̂_Gray ⊗ h_x)^2 + (X̂_Gray ⊗ h_y)^2)        (9)
where ⊗ denotes the convolution operation, X_Gray is the grayscaled fabric image to be tested, X̂_Gray is the grayscaled reconstructed image corresponding to the fabric image to be tested, m_X and m_X̂ are the corresponding gradient-magnitude images, and h_x and h_y are the Prewitt filter kernels in the horizontal and vertical directions, respectively.
The per-pixel gradient-magnitude similarities computed with formula (8) form the gradient-magnitude similarity map;
Step 4.6: normalize the gradient-magnitude similarity map obtained in step 4.5, as shown in formula (10):
X̄_GMS = (X_GMS - min(X_GMS)) / (max(X_GMS) - min(X_GMS))        (10)
where min(X_GMS) is the smallest pixel value in the gradient-magnitude similarity map, max(X_GMS) is the largest pixel value, and X̄_GMS is the normalized gradient-magnitude similarity map;
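A sketch of the gradient-magnitude similarity of formulas (8) to (10); the kernels below are the common 3×3 Prewitt form, whereas the exact coefficients used by the patent are given only as an image in the original filing:

```python
import numpy as np
from scipy.signal import convolve2d

# Common 3x3 Prewitt kernels (horizontal and vertical); assumed, not taken from the patent.
H_X = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float) / 3.0
H_Y = H_X.T

def gradient_magnitude(gray):
    """Gradient-magnitude image of formula (9)."""
    gx = convolve2d(gray, H_X, mode="same", boundary="symm")
    gy = convolve2d(gray, H_Y, mode="same", boundary="symm")
    return np.sqrt(gx ** 2 + gy ** 2)

def gms_map(gray_test, gray_recon, c=1e-4):
    """Gradient-magnitude similarity (formula (8)) followed by min-max normalization (formula (10))."""
    m_x, m_r = gradient_magnitude(gray_test), gradient_magnitude(gray_recon)
    gms = (2 * m_x * m_r + c) / (m_x ** 2 + m_r ** 2 + c)
    return (gms - gms.min()) / (gms.max() - gms.min() + 1e-12)
```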
Step 4.7: perform dot-product fusion, i.e., element-wise multiplication of the three matrices, of the residual image obtained in step 4.3, the structural-similarity image obtained in step 4.4, and the normalized gradient-magnitude similarity image obtained in step 4.6, as shown in formula (11):
X_Fusion = X_Residual ⊙ X_SSIM ⊙ X̄_GMS        (11)
where ⊙ denotes element-wise multiplication, X_Residual is the residual image, X_SSIM is the structural-similarity image, X̄_GMS is the normalized gradient-magnitude similarity image, and X_Fusion is the fused image after the multiplicative fusion;
Step 4.8: apply Gaussian filtering to the fused image obtained in step 4.7, using a Gaussian convolution kernel in a sliding-window operation over the image to obtain the filtered image, as shown in formula (12):
X_Fusion&Gaussian = X_Fusion * G(x, y)        (12)
where X_Fusion is the fused image, X_Fusion&Gaussian is the fused image after Gaussian filtering, * is the sliding-window convolution operation, and G(x, y) is the Gaussian kernel function with a 3×3 Gaussian convolution kernel, as shown in formula (13):
G(x, y) = (1 / (2π·σ_x·σ_y))·exp(-(x^2 / (2·σ_x^2) + y^2 / (2·σ_y^2)))        (13)
where (x, y) are the pixel coordinates of the fused image and σ_x and σ_y are the pixel standard deviations of the fused image along the x-axis and y-axis, respectively;
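A sketch of the dot-product fusion of formula (11) and the 3×3 Gaussian filtering of formula (12); the sigma value passed to the blur is an illustrative choice:

```python
import cv2

def fuse_and_smooth(residual, ssim, gms_norm, ksize=3, sigma=1.0):
    """Element-wise fusion (formula (11)) followed by Gaussian filtering (formula (12))."""
    fusion = residual * ssim * gms_norm
    return cv2.GaussianBlur(fusion, (ksize, ksize), sigma)
```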
Step 4.9: determine a threshold for the Gaussian-filtered fused image obtained in step 4.8 with an adaptive thresholding method and binarize it to obtain a binary image, as shown in formula (14):
T = μ + ε·σ,  p = 0 if the pixel value is lower than T, otherwise p = 1        (14)
where p is the pixel value of the binarized image, T is the adaptive threshold of the image, μ and σ are the mean and the variance of the Gaussian-filtered fused image respectively, and ε is the coefficient of the variance, for example ε = 3.5; if the pixel value at a point in the image is lower than the adaptive threshold, the pixel is set to logic 0, otherwise it is set to logic 1;
Step 4.10: apply a closing operation to the binarized image obtained in step 4.9 to obtain the final detection result image, where the closing operation is given by formula (15):
X_Closing = (X_binary ⊕ E) ⊖ E        (15)
where X_binary is the binarized image obtained in step 4.9, E is the 3×3 structuring element of the closing operation, ⊕ is the image dilation operation, ⊖ is the image erosion operation, and X_Closing is the final detection result image;
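A sketch of the adaptive thresholding of formula (14) and the 3×3 closing operation of formula (15); whether σ in the patent denotes the variance or the standard deviation is ambiguous, and the standard deviation is used here:

```python
import numpy as np
import cv2

def segment_defects(fused, eps=3.5):
    """Binarize with T = mu + eps * sigma (formula (14)), then apply a closing (formula (15))."""
    threshold = fused.mean() + eps * fused.std()
    binary = (fused >= threshold).astype(np.uint8) * 255   # defect pixels -> 255
    kernel = np.ones((3, 3), np.uint8)                     # structuring element E
    return cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
```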
Step 4.11: use the final detection result image obtained in step 4.10 to detect whether a defect exists and to locate the defect region; if the final detection result image contains a white region with pixel value 255, it is determined that the color-textured fabric image under test contains a defect, and the defect region is located where the white region appears.
The detection method of the present invention for defect regions of color-textured fabrics is described below with a specific example:
Experimental setup: the hardware environment is an Intel(R) Core(TM) i7-6850K CPU, a GeForce RTX 3090 (24 GB) graphics card, and 128 GB of memory. The software configuration is: operating system Ubuntu 18.04.5 LTS, deep learning framework PyTorch 1.7.1, environment based on Anaconda3 and Python 3.6.2.
Establishing the color-textured fabric dataset: according to the complexity of the color fabric patterns, the fabric images are divided into three texture categories: simple lattice (SL), striped lattice (SP), and complex lattice (CL). Datasets of four differently patterned color-textured fabrics were prepared, namely SP3, SP5, SP24, and CL1; each dataset contains defect-free samples for training and defective samples for testing, and all images are resized to a resolution of 512×512×3. Fig. 1 shows some defect-free images from the training set and Fig. 2 shows some defective images from the test set.
Evaluation indices: the comprehensive evaluation index (F1-measure, F1) and the mean intersection-over-union (IoU) among the pixel-level evaluation indices are adopted. The F1-measure evaluates detection performance more comprehensively, and the IoU indicates how close the detected defect region is to the true defect region. The evaluation indices are defined in formulas (16) and (17):
F1 = 2·P·R / (P + R),  with P = TP / (TP + FP) and R = TP / (TP + FN)        (16)
IoU = TP / (TP + FP + FN)        (17)
where TP is the number of pixels of defect regions that are successfully detected, FP is the number of pixels of non-defect regions that are falsely detected as defect regions, and FN is the number of pixels of defect regions that are not detected.
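A pixel-level implementation of the F1 and IoU indices of formulas (16) and (17), assuming boolean defect masks for the prediction and the ground truth:

```python
import numpy as np

def f1_and_iou(pred, gt, eps=1e-12):
    """Pixel-level F1-measure and IoU from TP, FP, FN counts (formulas (16)-(17))."""
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    return f1, iou
```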
Experimental procedure: first, a color-textured fabric dataset is established, including a training set of defect-free color-textured fabric images and a test set of defective images; secondly, a Transformer-based Swin-Unet model is constructed; then, the model is trained so that it acquires the ability to reconstruct normal samples and repair defect regions; finally, the color-textured fabric image to be tested is inspected by computing the difference between it and the corresponding reconstructed image and applying the proposed post-processing method, thereby detecting and localizing the defect regions.
Qualitative analysis of the experimental results: the detection results of the proposed Swin-Unet model were qualitatively compared with those of the UDCAE model; some results are shown in Fig. 7. As can be seen from Fig. 7, the Swin-Unet model can accurately detect the defect regions of the four datasets. Although the UDCAE model can also detect defect regions, it produces many over-detections; by comparison, the Swin-Unet model detects and localizes defect regions more precisely, and its results are closer to the true defect regions.
Quantitative analysis of the experimental results: the detection results of the proposed Swin-Unet model and of the UDCAE model were quantitatively compared using the F1 and IoU indices; larger values of F1 and IoU indicate better detection results. The comparison is given in Table 1.
Table 1: comparison of the detection results of the UDCAE and Swin-Unet models under different evaluation indices [the numerical values are provided as an image in the original publication and are not reproduced here].
As can be seen from Table 1, on all four datasets the values of the two evaluation indices differ between the two models by more than 5%; the Swin-Unet model achieves higher values than the UDCAE model under both F1 and IoU, while the UDCAE model obtains lower F1 and IoU values because of its many over-detections. Therefore, under the F1 and IoU evaluation indices, the Swin-Unet model performs better than the UDCAE model.
Summary of the experiments: the self-attention-based method proposed by the present invention for detecting defect regions in color-textured fabrics is essentially a Transformer-based Swin-Unet model. Without defect samples or manual annotation, the constructed unsupervised model can effectively reconstruct normal samples and repair defect regions; by computing the difference between the color fabric image to be tested and the corresponding reconstructed image and combining the improved post-processing method, defect regions are detected and localized quickly and accurately. The method does not require a large number of manually annotated defect samples and can effectively avoid practical problems such as the scarcity of defect samples, the imbalance of defect types, and the high cost of hand-crafted features. The experimental results show that the detection accuracy and speed of the method meet the technical requirements of production inspection of color-textured fabrics and provide an automatic defect detection scheme that is easy to put into engineering practice in the actual garment industry.

Claims (10)

  1. A self-attention-based method for detecting defect regions in color-textured fabrics, characterized in that it is implemented according to the following steps:
    Step 1: establishing a color-textured fabric dataset containing defect-free color-texture images, and superimposing noise on the defect-free images in the dataset;
    Step 2: building a Transformer-based Swin-Unet model;
    Step 3: inputting the noise-superimposed defect-free color-textured fabric images from step 1 into the Transformer-based Swin-Unet model built in step 2 for training, obtaining a trained Transformer-based Swin-Unet model;
    Step 4: using the trained Transformer-based Swin-Unet model from step 3 to reconstruct the color-textured fabric image to be tested, outputting the corresponding reconstructed image, and then judging and locating the defect region from the reconstructed image.
  2. The self-attention-based method for detecting defect regions in color-textured fabrics according to claim 1, characterized in that step 1 is specifically:
    Step 1.1: establishing the color-textured fabric dataset, the dataset including a training set of defect-free color-textured fabric images and a test set of defective color-textured fabric images, all images in the dataset being resized to a resolution of 512×512×3 with the image format .jpg;
    Step 1.2: superimposing noise on the defect-free images of the training set from step 1.1, as shown in formula (1):
    X̃ = X + N(0, 0.1)        (1)
    where X is a defect-free color-textured fabric image, N(0, 0.1) denotes Gaussian noise drawn from a normal distribution with mean 0 and variance 0.1, and X̃ is the defect-free color-textured fabric image after the noise is superimposed.
  3. The self-attention-based method for detecting defect regions in color-textured fabrics according to claim 2, characterized in that the Transformer-based Swin-Unet model in step 2 is specifically:
    the Transformer-based Swin-Unet model is a Transformer-based U-shaped symmetric encoder-decoder structure composed of an encoder, a bottleneck layer, and a decoder connected in sequence; the input layer of the encoder receives the noise-superimposed defect-free color-textured fabric image, the output layer of the decoder produces the reconstructed color-textured fabric image, and the encoder and decoder are connected to each other through three skip connections.
  4. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 3, characterized in that the encoder consists of an input layer, a Patch Embedding layer, 3 Swin Transformer Block layers and 3 Patch Merging layers connected together, wherein the Swin Transformer Block layers and the Patch Merging layers are connected alternately; the Patch Embedding layer is connected to a Swin Transformer Block layer by a convolution with a kernel size of 4, a stride of 4 and a padding of 0; each Swin Transformer Block layer is connected, through its self-attention layer, to the Patch Merging layer that follows it, the self-attention layer being included in said Swin Transformer Block layer; each Patch Merging layer is connected, through a fully connected layer and a channel normalization operation, to the Swin Transformer Block layer that follows it, the fully connected layer and the channel normalization operation being included in said Patch Merging layer; and the last Patch Merging layer of the encoder is connected to the bottleneck layer;
    the bottleneck layer consists of 2 Swin Transformer Block layers connected in sequence; the output layer of the encoder is connected to the first Swin Transformer Block layer of the bottleneck layer through a channel normalization operation, the channel normalization operation being included in the output layer of the encoder; the second Swin Transformer Block layer of the bottleneck layer is connected to the input layer of the decoder through a fully connected layer, the fully connected layer being included in said second Swin Transformer Block layer;
    the decoder consists of 3 Patch Expanding layers, 3 Swin Transformer Block layers, a Patch Projection layer and an output layer connected together; the first Patch Expanding layer of the decoder is connected to the second Swin Transformer Block layer of the bottleneck layer; within the decoder, the Patch Expanding layers and the Swin Transformer Block layers are connected alternately; each Patch Expanding layer is connected to a Swin Transformer Block layer through a fully connected layer and a channel normalization operation; the Swin Transformer Block layer is connected to the Patch Projection layer through its self-attention layer; and the Patch Projection layer is connected to the output layer by a convolution with a kernel size of 1, a stride of 1 and a padding of 0;
    the 3 Swin Transformer Block layers of the encoder are connected to the 3 Swin Transformer Block layers of the decoder in one-to-one correspondence.
  5. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 4, characterized in that the Swin Transformer Block layer consists of LayerNorm layers, a window multi-head self-attention layer, a shifted-window multi-head self-attention layer and MLP layers; the LayerNorm layer is a channel normalization operation; the window multi-head self-attention layer and the shifted-window multi-head self-attention layer each consist of 2 fully connected layers followed by a Softmax activation function, and the shifted-window multi-head self-attention layer additionally applies shift and slice operations after the Softmax activation function; the MLP layer consists of 2 fully connected layers with a GELU activation function between them; these are connected as follows:
    the input feature z^{l-1} first passes through a LayerNorm layer and then through the window multi-head self-attention layer, after which an addition operation yields ẑ^{l}; this then passes through a LayerNorm layer, an MLP layer and an addition operation to give z^{l}; it then passes through an LN layer, the shifted-window multi-head self-attention layer and an addition operation to give ẑ^{l+1}; finally it passes through a LayerNorm layer, an MLP layer and an addition operation to give the output feature z^{l+1}, as in Eq. (2):

    \hat{z}^{l} = \text{W-MSA}(\text{LN}(z^{l-1})) + z^{l-1}
    z^{l} = \text{MLP}(\text{LN}(\hat{z}^{l})) + \hat{z}^{l}
    \hat{z}^{l+1} = \text{SW-MSA}(\text{LN}(z^{l})) + z^{l}
    z^{l+1} = \text{MLP}(\text{LN}(\hat{z}^{l+1})) + \hat{z}^{l+1}    (2)

    where LN(·) denotes the output of the LayerNorm layer, MLP(·) denotes the output of the MLP layer, W-MSA(·) denotes the output of the window multi-head self-attention layer, SW-MSA(·) denotes the output of the shifted-window multi-head self-attention layer, and the LayerNorm layer is a channel normalization operation.
  6. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 5, characterized in that the window multi-head self-attention layer and the shifted-window multi-head self-attention layer compute the self-attention Attention(Q, K, V) within each window, as in Eq. (3):

    \text{Attention}(Q, K, V) = \text{SoftMax}\left(\frac{QK^{T}}{\sqrt{d}} + B\right)V    (3)

    where Q, K and V denote the query matrix, the key matrix and the value matrix, respectively, d denotes the dimension of the matrices, B denotes the bias matrix, and SoftMax is the activation function.
  7. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 6, characterized in that in the first Swin Transformer Block layer of the encoder the two fully connected layers of the MLP have 48 and 192 neurons, respectively; in the second Swin Transformer Block layer of the encoder they have 96 and 384 neurons, respectively; in the third Swin Transformer Block layer of the encoder they have 192 and 768 neurons, respectively; in the Swin Transformer Block layers of the bottleneck layer the two fully connected layers of the MLP have 384 and 1536 neurons, respectively; and in each Swin Transformer Block layer of the decoder the numbers of MLP neurons are equal to those of the corresponding MLP layers of the encoder.
  8. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 7, characterized in that step 3 is specifically:
    Step 3.1: input the noise-superimposed, defect-free color-textured fabric images into the Transformer-based Swin-Unet model built in step 2 to obtain reconstructed images;
    Step 3.2: compute the mean-square-error loss between the reconstructed images obtained in step 3.1 and the corresponding color-textured fabric images without superimposed noise, as in Eq. (4):

    L_{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(X(i) - \hat{X}(i)\right)^{2}    (4)

    where X̂(i) is the reconstructed image, X(i) is the corresponding color-textured fabric image without superimposed noise, n is the number of images without superimposed noise, and L_MSE is the loss function;
    Step 3.3: taking the minimization of L_MSE as the optimization objective, use the AdamW optimizer with a learning rate of 0.0001 to minimize the loss function, set a maximum number of iterations and train on the images to obtain the trained Transformer-based Swin-Unet model.
  9. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 8, characterized in that step 4 is specifically:
    Step 4.1: input the color fabric image to be inspected into the Transformer-based Swin-Unet model trained in step 3 to obtain the corresponding reconstructed image;
    Step 4.2: convert the input color fabric image to be inspected and its corresponding reconstructed image to grayscale, as in Eq. (5):

    X_{Gray} = 0.2125\,X_r + 0.7154\,X_g + 0.0721\,X_b    (5)

    where X_Gray denotes the grayscale image, and X_r, X_g and X_b are the pixel values of the three RGB color channels of the color fabric image to be inspected or of the corresponding reconstructed image;
    Step 4.3: compute the absolute difference between the gray values of corresponding pixels of the grayscale test fabric image and of the corresponding reconstructed image from step 4.2, as in Eq. (6):

    X_{Residual} = \left|X_{Gray} - \hat{X}_{Gray}\right|    (6)

    where X_Gray is the grayscale fabric image to be inspected, X̂_Gray is the corresponding grayscale reconstructed image, and X_Residual is the residual image;
    Step 4.4: compute the structural similarity between the grayscale test fabric image and the corresponding reconstructed image from step 4.2, as in Eq. (7):

    \text{SSIM}(X, \hat{X}) = \frac{(2\mu_X\mu_{\hat{X}} + C_1)(2\sigma_{X\hat{X}} + C_2)}{(\mu_X^2 + \mu_{\hat{X}}^2 + C_1)(\sigma_X^2 + \sigma_{\hat{X}}^2 + C_2)}    (7)

    where μ_X and μ_X̂ are the mean gray values of the test fabric image and of the corresponding reconstructed image, respectively, σ_X and σ_X̂ are their gray-level standard deviations, σ_XX̂ is the covariance between the test fabric image and the corresponding reconstructed image, and C_1 and C_2 are constants that prevent the denominator from being zero; SSIM(X, X̂) measures the similarity of the two images in terms of luminance, contrast and structural information; a sliding window is moved over the image plane with a given stride and the similarities of the overlapping regions are averaged, giving the structural-similarity image X_SSIM;
    Step 4.5: compute the gradient-magnitude similarity between the grayscale test fabric image and the corresponding reconstructed image from step 4.2, as in Eq. (8):

    X_{GMS}(i) = \frac{2\,m_X(i)\,m_{\hat{X}}(i) + c}{m_X(i)^2 + m_{\hat{X}}(i)^2 + c}    (8)

    where i is the position of a pixel in the image, X_GMS is the gradient-magnitude similarity, c is a constant that prevents the denominator from being zero, and m_X and m_X̂ are the gradient-magnitude images of the grayscale test fabric image and of the grayscale reconstructed image, respectively; the gradient-magnitude images are defined as in Eq. (9):

    m_X = \sqrt{(X_{Gray} \circledast h_x)^2 + (X_{Gray} \circledast h_y)^2}
    m_{\hat{X}} = \sqrt{(\hat{X}_{Gray} \circledast h_x)^2 + (\hat{X}_{Gray} \circledast h_y)^2}    (9)

    where ⊛ denotes the convolution operation, X_Gray is the grayscale fabric image to be inspected, X̂_Gray is the corresponding grayscale reconstructed image, m_X and m_X̂ are the gradient-magnitude images of the grayscale test fabric image and of the grayscale reconstructed image, respectively, and h_x and h_y are the Prewitt filters in the horizontal and vertical directions;
    the gradient-magnitude similarities of all pixels computed according to Eq. (8) form the gradient-magnitude similarity map;
    Step 4.6: normalize the gradient-magnitude similarity map obtained in step 4.5, as in Eq. (10):

    \hat{X}_{GMS} = \frac{X_{GMS} - X_{GMS}^{min}}{X_{GMS}^{max} - X_{GMS}^{min}}    (10)

    where X_GMS^min is the smallest pixel value in the gradient-magnitude similarity map, X_GMS^max is the largest pixel value in the gradient-magnitude similarity map, and X̂_GMS is the normalized gradient-magnitude similarity map;
    Step 4.7: fuse the residual image from step 4.3, the structural-similarity image from step 4.4 and the normalized gradient-magnitude similarity image from step 4.6 by point-wise multiplication, as in Eq. (11):

    X_{Fusion} = X_{Residual} \odot X_{SSIM} \odot \hat{X}_{GMS}    (11)

    where X_Residual is the residual image, X_SSIM is the structural-similarity image, X̂_GMS is the normalized gradient-magnitude similarity image, and X_Fusion is the fused image obtained by multiplicative fusion;
    Step 4.8: apply Gaussian filtering to the fused image obtained in step 4.7 by sliding a Gaussian convolution kernel over the image to obtain the filtered image, as in Eq. (12):

    X_{Fusion\&Gaussian} = X_{Fusion} * G(x, y)    (12)

    where X_Fusion is the fused image, X_Fusion&Gaussian is the Gaussian-filtered fused image, * is the sliding-window convolution operation, and G(x, y) is the Gaussian kernel function, as in Eq. (13):

    G(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left(-\frac{x^2}{2\sigma_x^2} - \frac{y^2}{2\sigma_y^2}\right)    (13)

    where (x, y) are the pixel coordinates of the fused image, and σ_x and σ_y are the pixel standard deviations of the fused image along the x-axis and y-axis directions, respectively;
    Step 4.9: determine a threshold for the Gaussian-filtered fused image obtained in step 4.8 by an adaptive-threshold method, and binarize the image to obtain a binary image, as in Eq. (14):

    p = \begin{cases} 0, & X_{Fusion\&Gaussian}(x, y) < T \\ 1, & X_{Fusion\&Gaussian}(x, y) \ge T \end{cases}, \quad T = \mu + \varepsilon\sigma    (14)

    where p is the pixel value of the binarized image, T is the adaptive threshold of the image, μ and σ are respectively the mean and the variance of the Gaussian-filtered fused image, and ε is the coefficient of the variance; if the pixel value at a point in the image is below the adaptive threshold, that pixel is set to logic 0, otherwise it is set to logic 1;
    Step 4.10: apply a morphological closing operation to the binarized image obtained in step 4.9 to obtain the final detection-result image, where the closing operation is as in Eq. (15):

    X_{Closing} = (X_{binary} \oplus E) \ominus E    (15)

    where X_binary is the binarized image obtained in step 4.9, E is the 3×3 structuring element of the closing operation, ⊕ is the image dilation operation, ⊖ is the image erosion operation, and X_Closing is the final detection-result image;
    Step 4.11: use the final detection-result image obtained in step 4.10 to determine whether a defect exists and to locate the defective region; if the final detection-result image contains a white region with a pixel value of 255, the color-textured fabric image under inspection is judged to contain a defect, and the defective region is the location of that white region.
  10. The self-attention-based method for detecting defective regions of color-textured fabric according to claim 9, characterized in that in step 4.5 the Prewitt filter has a size of 3×3 and its filter parameters in the horizontal and vertical directions are

    h_x = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix}, \quad h_y = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}

    respectively; the point-wise multiplicative fusion in step 4.7 is an element-wise multiplication of the three matrices; the Gaussian convolution kernel in step 4.8 has a size of 3×3; and the parameter ε of the adaptive-threshold method in step 4.9 is set empirically to 3.5.
PCT/CN2021/139961 2021-10-27 2021-12-21 一种基于自注意力的彩色纹理织物缺陷区域的检测方法 WO2023070911A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111257379.0A CN113989228A (zh) 2021-10-27 2021-10-27 一种基于自注意力的彩色纹理织物缺陷区域的检测方法
CN202111257379.0 2021-10-27

Publications (1)

Publication Number Publication Date
WO2023070911A1 true WO2023070911A1 (zh) 2023-05-04

Family

ID=79742771

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139961 WO2023070911A1 (zh) 2021-10-27 2021-12-21 一种基于自注意力的彩色纹理织物缺陷区域的检测方法

Country Status (2)

Country Link
CN (1) CN113989228A (zh)
WO (1) WO2023070911A1 (zh)


Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2617555B (en) * 2022-04-07 2024-06-26 Milestone Systems As Image processing method, apparatus, computer program and computer-readable data carrier
CN114494254B (zh) * 2022-04-14 2022-07-05 科大智能物联技术股份有限公司 基于GLCM与CNN-Transformer融合的产品外观缺陷分类方法、存储介质
CN114841977B (zh) * 2022-05-17 2023-04-25 南京信息工程大学 一种基于Swin Transformer结构结合SSIM和GMSD的疵点检测方法
CN114820631B (zh) * 2022-07-04 2022-09-20 南通中豪超纤制品有限公司 一种抗纹理干扰的面料缺陷检测方法
CN115018750B (zh) * 2022-08-08 2022-11-08 湖南大学 中波红外高光谱及多光谱图像融合方法、系统及介质
CN115082490B (zh) * 2022-08-23 2022-11-15 腾讯科技(深圳)有限公司 异常预测方法、异常预测模型的训练方法、装置及设备
CN115578406B (zh) * 2022-12-13 2023-04-07 四川大学 基于上下文融合机制的cbct颌骨区域分割方法及系统
CN116703885A (zh) * 2023-06-30 2023-09-05 南京邮电大学 一种基于Swin Transformer的表面缺陷检测方法及系统
CN117173114A (zh) * 2023-08-28 2023-12-05 哈尔滨工业大学 一种基于图像重建的互连铟柱缺陷检测方法
CN117974648B (zh) * 2024-03-29 2024-06-04 中国机械总院集团江苏分院有限公司 一种织物瑕疵检测方法


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200175352A1 (en) * 2017-03-14 2020-06-04 University Of Manitoba Structure defect detection using machine learning algorithms
CN109829903A (zh) * 2019-01-28 2019-05-31 合肥工业大学 一种基于卷积去噪自编码器的芯片表面缺陷检测方法
CN111402197A (zh) * 2020-02-09 2020-07-10 西安工程大学 一种针对色织物裁片缺陷区域的检测方法
CN111815601A (zh) * 2020-07-03 2020-10-23 浙江大学 一种基于深度卷积自编码器的纹理图像表面缺陷检测方法
CN112381794A (zh) * 2020-11-16 2021-02-19 哈尔滨理工大学 一种基于深度卷积生成网络的印刷缺陷检测方法

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363441A (zh) * 2023-05-31 2023-06-30 克拉玛依市百事达技术开发有限公司 具备标记功能的管道腐蚀检测系统
CN116363441B (zh) * 2023-05-31 2023-08-08 克拉玛依市百事达技术开发有限公司 具备标记功能的管道腐蚀检测系统
CN116563279A (zh) * 2023-07-07 2023-08-08 山东德源电力科技股份有限公司 基于计算机视觉的量测开关检测方法
CN116563279B (zh) * 2023-07-07 2023-09-19 山东德源电力科技股份有限公司 基于计算机视觉的量测开关检测方法
CN116630950A (zh) * 2023-07-25 2023-08-22 济南大学 一种高精度识别轮辋焊缝的方法
CN117078608A (zh) * 2023-08-06 2023-11-17 武汉纺织大学 一种基于双掩码引导的高反光皮革表面缺陷检测方法
CN117078608B (zh) * 2023-08-06 2024-01-26 武汉纺织大学 一种基于双掩码引导的高反光皮革表面缺陷检测方法
CN117133059A (zh) * 2023-08-18 2023-11-28 北京科技大学 一种基于局部注意力机制的人脸活体检测方法及装置
CN117133059B (zh) * 2023-08-18 2024-03-01 北京科技大学 一种基于局部注意力机制的人脸活体检测方法及装置
CN116823825A (zh) * 2023-08-29 2023-09-29 山东海德尼克液压科技有限公司 一种阀门配件铸造缺陷智能识别方法
CN116823825B (zh) * 2023-08-29 2023-12-05 山东海德尼克液压科技有限公司 一种阀门配件铸造缺陷智能识别方法
CN116843685A (zh) * 2023-08-31 2023-10-03 山东大学 一种基于图像检测的3d打印工件缺陷识别方法及系统
CN116843685B (zh) * 2023-08-31 2023-12-12 山东大学 一种基于图像检测的3d打印工件缺陷识别方法及系统
CN116843689A (zh) * 2023-09-01 2023-10-03 山东众成菌业股份有限公司 一种菌盖表面破损检测方法
CN116843689B (zh) * 2023-09-01 2023-11-21 山东众成菌业股份有限公司 一种菌盖表面破损检测方法
CN117557622A (zh) * 2023-10-13 2024-02-13 中国人民解放军战略支援部队航天工程大学 一种融合渲染信息的航天器位姿估计方法
CN117094999A (zh) * 2023-10-19 2023-11-21 南京航空航天大学 一种跨尺度缺陷检测方法
CN117094999B (zh) * 2023-10-19 2023-12-22 南京航空航天大学 一种跨尺度缺陷检测方法
CN117291922B (zh) * 2023-11-27 2024-01-30 浙江日井泵业股份有限公司 一种不锈钢多级泵叶轮缺陷视觉检测方法
CN117291922A (zh) * 2023-11-27 2023-12-26 浙江日井泵业股份有限公司 一种不锈钢多级泵叶轮缺陷视觉检测方法
CN117372431A (zh) * 2023-12-07 2024-01-09 青岛天仁微纳科技有限责任公司 一种纳米压印模具的图像检测方法
CN117368122B (zh) * 2023-12-07 2024-02-13 津泰(天津)医疗器械有限公司 一种基于比色卡的frd宫颈染色实时比对方法
CN117372431B (zh) * 2023-12-07 2024-02-20 青岛天仁微纳科技有限责任公司 一种纳米压印模具的图像检测方法
CN117368122A (zh) * 2023-12-07 2024-01-09 津泰(天津)医疗器械有限公司 一种基于比色卡的frd宫颈染色实时比对方法
CN117635590A (zh) * 2023-12-12 2024-03-01 深圳市英伟胜科技有限公司 一种笔记本电脑外壳的缺陷检测方法、缺陷检测装置和存储介质
CN117746045B (zh) * 2024-02-08 2024-05-28 江西师范大学 一种Transformer和卷积融合的医学图像分割方法及系统
CN117746045A (zh) * 2024-02-08 2024-03-22 江西师范大学 一种Transformer和卷积融合的医学图像分割方法及系统
CN118115449A (zh) * 2024-02-26 2024-05-31 中国矿业大学 一种针对复杂纹理工业品的缺陷检测方法
CN117830299B (zh) * 2024-03-04 2024-05-17 湖南南源新材料有限公司 一种无纺布表面缺陷检测方法及系统
CN117830299A (zh) * 2024-03-04 2024-04-05 湖南南源新材料有限公司 一种无纺布表面缺陷检测方法及系统
CN117934980A (zh) * 2024-03-25 2024-04-26 山东山科数字经济研究院有限公司 基于注意力监督调整的玻璃容器缺陷检测方法及系统
CN117934980B (zh) * 2024-03-25 2024-05-31 山东山科数字经济研究院有限公司 基于注意力监督调整的玻璃容器缺陷检测方法及系统
CN118196070A (zh) * 2024-04-08 2024-06-14 江苏海洋大学 一种基于无人机热红外遥感的光伏板缺陷识别方法
CN118097320A (zh) * 2024-04-29 2024-05-28 浙江大学 一种双分支的晶圆sem缺陷图分类和分割方法、系统
CN118154476A (zh) * 2024-05-09 2024-06-07 山东浪潮科学研究院有限公司 一种全局文字图像修复方法及装置、介质
CN118154585A (zh) * 2024-05-09 2024-06-07 山东鑫泰新材料科技有限公司 一种冷轧钢板缺陷图像数据处理方法
CN118279907A (zh) * 2024-06-03 2024-07-02 菏泽学院 一种基于Transformer与CNN的中草药图像识别系统
CN118365974A (zh) * 2024-06-20 2024-07-19 山东省水利科学研究院 一种基于混合神经网络的水质类别检测方法、系统及设备

Also Published As

Publication number Publication date
CN113989228A (zh) 2022-01-28

Similar Documents

Publication Publication Date Title
WO2023070911A1 (zh) 一种基于自注意力的彩色纹理织物缺陷区域的检测方法
Huang et al. Fabric defect segmentation method based on deep learning
WO2023050563A1 (zh) 一种基于自编码器的彩色纹理织物缺陷区域的检测方法
Yuan et al. A deep convolutional neural network for detection of rail surface defect
CN111402197B (zh) 一种针对色织物裁片缺陷区域的检测方法
CN112381788B (zh) 一种基于双分支匹配网络的零部件表面缺陷增量检测方法
CN113643268B (zh) 基于深度学习的工业制品缺陷质检方法、装置及存储介质
CN112329588A (zh) 一种基于Faster R-CNN的管道故障检测方法
CN106373124B (zh) 基于灰度共生矩阵与ransac的工业产品表面缺陷视觉检测方法
CN113393438B (zh) 一种基于卷积神经网络的树脂镜片缺陷检测方法
CN114119500A (zh) 一种基于生成对抗网络的色织物缺陷区域的检测方法
CN113989224A (zh) 一种基于生成对抗网络的彩色纹理织物缺陷检测方法
CN113838040A (zh) 一种针对彩色纹理织物缺陷区域的检测方法
CN112465784B (zh) 一种地铁夹钳外观异常检测的方法
CN117237274A (zh) 基于生成对抗网络的数码印花织物缺陷检测方法
CN111161228B (zh) 一种基于迁移学习的纽扣表面缺陷检测方法
CN113902695B (zh) 一种针对色织物裁片缺陷区域的检测方法
Zhang et al. An improved DCGAN for fabric defect detection
CN111899221B (zh) 一种面向外观缺陷检测的自迁移学习方法
CN114863211A (zh) 一种基于深度学习的磁瓦缺陷检测及分割方法
CN113901947A (zh) 一种小样本下的轮胎表面瑕疵智能识别方法
CN110322437B (zh) 一种基于自动编码器和bp神经网络的织物缺陷检测方法
Wang et al. Patterned Fabric Defect Detection Based on Double-branch Parallel Improved Faster-RCNN
CN111833323A (zh) 基于稀疏表示与svm的分任务铁路货车图像质量判断方法
Zhang et al. A Fabric Defect Detection Algorithm Based on YOLOv8

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21962253

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE