WO2024051686A1 - Compression training method and apparatus for a defect detection model - Google Patents

Compression training method and apparatus for a defect detection model

Info

Publication number
WO2024051686A1
WO2024051686A1 (PCT/CN2023/116994)
Authority
WO
WIPO (PCT)
Prior art keywords
defect detection
detection model
feature map
feature
sample image
Prior art date
Application number
PCT/CN2023/116994
Other languages
English (en)
French (fr)
Inventor
韩旭
颜聪
Original Assignee
东声(苏州)智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东声(苏州)智能科技有限公司
Publication of WO2024051686A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a compression training method and apparatus for a defect detection model. A segmentation labeling factor matrix of each sample image is obtained by segmentation labeling; each sample image is input into a first defect detection model and a second defect detection model respectively; a first feature map output by a target convolutional layer in the first defect detection model and a second feature map output by the corresponding target convolutional layer in the second defect detection model are extracted; corrected distances between corresponding feature vectors of the first feature map and the second feature map are computed with the segmentation labeling factor matrix; and the sum of the corrected distances between all feature vectors of the first feature map and the second feature map is taken as a first loss function. The embodiments can improve the accuracy of the compressed defect detection model in detecting tiny defects in product appearance.

Description

Compression training method and apparatus for a defect detection model
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202211075557.2, entitled "Compression training method and apparatus for a defect detection model" and filed with the China National Intellectual Property Administration on September 5, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the technical field of machine-vision defect detection, and in particular to a compression training method and apparatus for a defect detection model.
Background
With the development of image processing and artificial intelligence, the industry typically trains deep learning defect detection models and deploys them on industrial smart cameras at production-line stations for product surface defect detection. Deep learning defect detection models usually have complex network structures and heavy computational loads, require a high-compute hardware environment, and are therefore unsuitable for direct deployment on low-compute mobile devices such as handheld cameras.
To deploy deep-learning-based defect detection models on low-compute mobile devices, so that product surface defects can be detected quickly with handheld cameras and similar devices, the industry typically compresses the model with techniques such as pruning, quantization, and knowledge distillation, obtaining a lightweight deep learning defect detection model for deployment and faster inference. Knowledge distillation uses the supervision information (i.e., knowledge) of a large-scale teacher model to train a lightweight student model, aiming at good performance and accuracy. The supervision information of the large-scale teacher model may come from the teacher model's output feature knowledge or from its intermediate-layer feature knowledge.
In real industrial practice of product appearance defect detection, however, defect samples are often few and the defects are tiny. Lightweight deep learning defect detection models obtained with existing compression methods such as knowledge distillation show reduced detection accuracy for such tiny appearance defects under small defect sample sizes. An improved method is therefore urgently needed, so that deep learning defect detection models can classify and detect product appearance defects accurately and quickly on low-compute mobile devices.
Summary
In view of this, the present application proposes a compression training method and apparatus for a defect detection model, so as to improve the ability of the distillation-compressed defect detection model to perceive the features of defect images containing tiny defects, and to improve the accuracy of the compressed defect detection model in detecting tiny defects in product appearance.
In a first aspect, an embodiment of the present application proposes a compression training method for a defect detection model, including:
performing segmentation labeling of defect regions on a sample image data set of product appearance to obtain a segmentation labeling factor matrix of each sample image;
inputting each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extracting a first feature map output by a target convolutional layer in the first defect detection model and a second feature map output by the corresponding target convolutional layer in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as the pre-trained first defect detection model but more lightweight;
calculating the distances between corresponding feature vectors of the first feature map and the second feature map, correcting the distances with the corresponding elements of the segmentation labeling factor matrix to obtain corrected distances between the corresponding feature vectors of the first feature map and the second feature map, and calculating the sum of the corrected distances between all feature vectors of the first feature map and the second feature map as a first loss function;
iteratively training the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
In an optional implementation, the segmentation labeling factor matrix labels the factor value corresponding to each pixel of each sample image, wherein the factor values of pixels in the defect region of each sample image and the factor values of pixels in the non-defect region of that sample image are opposite numbers of each other.
In an optional implementation, calculating the distances between corresponding feature vectors of the first feature map and the second feature map and correcting the distances with the corresponding elements of the segmentation labeling factor matrix to obtain the corrected distances between the corresponding feature vectors of the first feature map and the second feature map includes:
calculating the squared Euclidean distance between the respective normalized vectors of the corresponding feature vectors of the first feature map and the second feature map;
calculating the product of the squared Euclidean distance and the corresponding element of the segmentation labeling factor matrix to obtain the corrected distance between the corresponding feature vectors of the first feature map and the second feature map.
In an optional implementation, calculating the product of the squared Euclidean distance and the corresponding element of the segmentation labeling factor matrix to obtain the corrected distance between the corresponding feature vectors of the first feature map and the second feature map includes:
performing a size transformation operation on the segmentation labeling factor matrix to obtain a transformed segmentation labeling factor matrix aligned to the size of the first feature map and the second feature map;
calculating the product of the squared Euclidean distance and the corresponding element of the transformed segmentation labeling factor matrix to obtain the corrected distance between the corresponding feature vectors of the first feature map and the second feature map.
In an optional implementation, the method further includes:
after inputting each sample image into the second defect detection model, obtaining a defect classification probability vector output by the second defect detection model;
calculating the cross-entropy loss between the defect classification probability vector and the classification annotation vector of the sample image as a second loss function;
calculating the weighted sum of the first loss function and the second loss function as a total loss function, and iteratively training the second defect detection model based on minimizing the total loss function to obtain the distillation-compressed second defect detection model.
In an optional implementation, the method further includes: for the multiple sample images of each batch in the sample image data set, calculating the average of the total loss functions obtained by inputting each sample image into the first defect detection model and the second defect detection model, and iteratively training the second defect detection model based on minimizing the average of the total loss function.
In an optional implementation, the method further includes: if the sizes of the first feature map and the second feature map are inconsistent, downsampling the first feature map or upsampling the second feature map to align the sizes of the first feature map and the second feature map.
In a second aspect, another embodiment of the present application further proposes a compression training method for a defect detection model, including:
performing segmentation labeling of defect regions on a sample image data set of product appearance to obtain a segmentation labeling factor matrix of each sample image;
inputting each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extracting multiple first feature maps output by multiple target convolutional layers in the first defect detection model and multiple second feature maps output by the corresponding multiple target convolutional layers in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as the pre-trained first defect detection model but more lightweight;
sequentially calculating, for each first feature map and the corresponding second feature map among the multiple first feature maps and second feature maps, the distances between their corresponding feature vectors; correcting the distances with the corresponding elements of the segmentation labeling factor matrix to obtain corrected distances between the corresponding feature vectors of each first feature map and the corresponding second feature map; calculating the sum of the corrected distances between all feature vectors of each first feature map and the corresponding second feature map; and calculating the accumulation of these sums over the multiple first feature maps and second feature maps as a first loss function;
iteratively training the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
In a third aspect, an embodiment of the present application proposes a compression training apparatus for a defect detection model, including:
a segmentation labeling unit, configured to perform segmentation labeling of defect regions on a sample image data set of product appearance to obtain a segmentation labeling factor matrix of each sample image;
a feature extraction unit, configured to input each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extract a first feature map output by a target convolutional layer in the first defect detection model and a second feature map output by the corresponding target convolutional layer in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as the pre-trained first defect detection model but more lightweight;
a first loss assessment unit, configured to calculate the distances between corresponding feature vectors of the first feature map and the second feature map, correct the distances with the corresponding elements of the segmentation labeling factor matrix to obtain corrected distances between the corresponding feature vectors of the first feature map and the second feature map, and calculate the sum of the corrected distances between all feature vectors of the first feature map and the second feature map as a first loss function;
a first iterative training unit, configured to iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
In a fourth aspect, another embodiment of the present application further proposes a compression training apparatus for a defect detection model, including:
a segmentation labeling unit, configured to perform segmentation labeling of defect regions on a sample image data set of product appearance to obtain a segmentation labeling factor matrix of each sample image;
a feature extraction unit, configured to input each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extract multiple first feature maps output by multiple target convolutional layers in the first defect detection model and multiple second feature maps output by the corresponding multiple target convolutional layers in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as the pre-trained first defect detection model but more lightweight;
a first loss assessment unit, configured to sequentially calculate, for each first feature map and the corresponding second feature map among the multiple first feature maps and second feature maps, the distances between their corresponding feature vectors, correct the distances with the corresponding elements of the segmentation labeling factor matrix to obtain corrected distances between the corresponding feature vectors of each first feature map and the corresponding second feature map, calculate the sum of the corrected distances between all feature vectors of each first feature map and the corresponding second feature map, and calculate the accumulation of these sums over the multiple first feature maps and second feature maps as a first loss function;
a first iterative training unit, configured to iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
The embodiments of the present application can achieve at least the following beneficial effects: the distances between all feature vectors of the first feature map and the second feature map are corrected with the factor values in the segmentation labeling factor matrix, so that when the defect detection model is compression-trained based on minimizing the first loss function, the ability of the distillation-compressed defect detection model to perceive the features of defect images containing tiny defects is improved, and the accuracy of the compressed defect detection model in detecting tiny defects in product appearance is improved.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and should not be regarded as limiting its scope.
Figure 1 is a schematic flowchart of a compression training method for a defect detection model according to an embodiment of the present application;
Figure 2 is a schematic diagram of the network structures of the first defect detection model ResNet101 and the second defect detection model ResNet18 according to an embodiment of the present application;
Figure 3 is a schematic flowchart of a compression training method for a defect detection model according to another embodiment of the present application;
Figure 4 is a schematic flowchart of a compression training method for a defect detection model according to another embodiment of the present application;
Figure 5 is a schematic structural diagram of a compression training apparatus for a defect detection model according to an embodiment of the present application;
Figure 6 is a schematic structural diagram of a compression training apparatus for a defect detection model according to another embodiment of the present application;
Figure 7 is a partial schematic structural diagram of a compression training apparatus for a defect detection model according to another embodiment of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. It should be understood, however, that the described embodiments are only some exemplary embodiments of the present application rather than all of them, and the following detailed description is not intended to limit the claimed scope of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
It should be noted that the terms "first", "second", and the like in the specification and claims of the present application are only used to distinguish similar objects, are not used to describe a particular order or sequence, and should not be understood as indicating or implying relative importance.
As mentioned above, industrial practice of product appearance defect detection often faces few defect samples and tiny defect sizes, and existing schemes that obtain a lightweight deep learning defect detection model through knowledge distillation show reduced accuracy for tiny appearance defects under such small defect sample sizes. In this scenario, because the pre-trained first deep learning defect detection model serving as the teacher model is trained on many non-defect images and few defect images, its feature perception of non-defect images is stronger than its feature perception of defect images containing tiny defects, so the features it extracts from defect images with tiny defects are, on the whole, not clearly distinguishable from those it extracts from non-defect images. When a lightweight second deep learning defect detection model is obtained from this teacher model through knowledge distillation, the second model learns the feature knowledge output by the teacher, so the features it extracts from defect images containing tiny defects have the same problem, that is, they cannot be clearly distinguished from the features extracted from non-defect images. When the distillation-compressed second deep learning defect detection model is deployed on a mobile device and used to detect product appearance defects, this impairs the accuracy of classifying and detecting tiny appearance defects.
For this reason, the present application proposes a compression training method and apparatus for a defect detection model, which adds a segmentation labeling factor for image defect regions to the knowledge distillation compression training of the defect detection model, thereby improving the feature perception of defect images containing tiny defects and the accuracy of the compressed defect detection model in detecting tiny defects in product appearance.
Figure 1 is a schematic flowchart of a compression training method for a defect detection model according to an embodiment of the present application. As shown in Figure 1, the compression training method of the embodiment includes the following steps:
Step S110: perform segmentation labeling of defect regions on a sample image data set of product appearance to obtain a segmentation labeling factor matrix of each sample image.
In this step, segmentation labeling of defect regions is first performed on the sample image data set of product appearance to obtain the segmentation labeling factor matrix of each sample image. The segmentation labeling factor matrix of each sample image labels the factor value corresponding to each pixel of that sample image, and assigns to the pixels of the defect region a factor value different from that of the pixels of the non-defect region, so that in subsequent steps it can be used to correct the distances between the first feature map extracted from the first defect detection model and the second feature map extracted from the second defect detection model.
In one implementation, the size of the segmentation labeling factor matrix corresponds to the pixel size of each sample image, and a factor value is assigned, at the corresponding position of the matrix, to each pixel of the sample image, where the factor values of pixels in the defect region and the factor values of pixels in the non-defect region are opposite numbers of each other. Assuming the segmentation labeling factor matrix of a sample image is denoted $A$, then for each pixel $(i, j)$ the factor value $A(i, j)$ is expressed as:
$$A(i,j)=\begin{cases}a, & (i,j)\in R_d\\ -a, & (i,j)\in R_n\end{cases}\qquad (0<a\le 1)$$
where $R_d$ denotes the set of pixels of the non-defect region of the sample image and $R_n$ denotes the set of pixels of the defect region. In other words, pixels of the non-defect region are assigned the positive factor value $a$, and pixels of the defect region are assigned the negative factor value $-a$. By assigning the pixels of the defect region a factor value opposite to that of the non-defect region, the distillation training of the second defect detection model from the first defect detection model can enlarge, in the first loss function, the distances between the feature points corresponding to the defect regions of the sample image, improving the feature perception of the distillation-compressed second defect detection model for defect images containing tiny defects. This is elaborated further with the subsequent steps.
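For illustration only, here is a minimal sketch of how such a segmentation labeling factor matrix could be built from a binary defect mask; the function name, the NumPy representation, and the default a = 1.0 are assumptions, not part of the application:

```python
import numpy as np

def make_factor_matrix(defect_mask: np.ndarray, a: float = 1.0) -> np.ndarray:
    """Build the segmentation labeling factor matrix A for one sample image.

    defect_mask: H x W binary array, 1 for pixels in the defect region Rn and
    0 for pixels in the non-defect region Rd; 0 < a <= 1.
    """
    A = np.full(defect_mask.shape, a, dtype=np.float32)  # +a for pixels in Rd
    A[defect_mask.astype(bool)] = -a                     # -a for pixels in Rn
    return A
```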
Step S120: input each sample image in the sample image data set into the first defect detection model and the second defect detection model respectively, and respectively extract the first feature map output by the target convolutional layer in the first defect detection model and the second feature map output by the corresponding target convolutional layer in the second defect detection model, where the second defect detection model is a deep convolutional neural network model of the same architecture as the pre-trained first defect detection model but more lightweight.
In this step, the pre-trained first defect detection model is selected as the teacher model, and a randomly initialized second defect detection model as the student model. The first defect detection model is a large-scale deep convolutional neural network model, while the second defect detection model is of the same architecture but more lightweight; the second defect detection model, as the compressed model obtained by distillation learning from the first defect detection model, is ultimately deployed on a mobile device to classify and detect product appearance defect images. In one implementation, the first defect detection model may be selected from the deep residual network models ResNet50, ResNet101, ResNet152, and the like, and the second defect detection model may be the deep residual network model ResNet18. It should be understood that deep residual network models are merely exemplary optional implementations of the first and second defect detection models; in the embodiments of the present application the two models are not limited to deep residual networks, and other deep convolutional neural network models suitable for defect classification and detection, such as DenseNet and VGG network models, are equally applicable to the various embodiments of the present application.
In one implementation, as an example, the deeper ResNet101 may be selected as the first defect detection model and the shallower ResNet18 as the second defect detection model. Figure 2 shows the network structures of the first defect detection model ResNet101 and the second defect detection model ResNet18. As shown in Figure 2, ResNet101 as the first defect detection model and ResNet18 as the second defect detection model share the same architecture, namely five convolutional layer parts. The five convolutional layers of ResNet101 are the first convolutional layer 210-1 (conv1), the second convolutional layer 220-1 (conv2_x), the third convolutional layer 230-1 (conv3_x), the fourth convolutional layer 240-1 (conv4_x), and the fifth convolutional layer 250-1 (conv5_x). The five convolutional layers of ResNet18 are the first convolutional layer 210-2 (conv1), the second convolutional layer 220-2 (conv2_x), the third convolutional layer 230-2 (conv3_x), the fourth convolutional layer 240-2 (conv4_x), and the fifth convolutional layer 250-2 (conv5_x).
For the first defect detection model ResNet101 and the second defect detection model ResNet18, the first convolutional layers 210-1 and 210-2 (conv1) are both preprocessing layers with 64 convolution kernels of size 7×7; they preprocess the input sample image and output a 112×112×64 feature map, where 112×112 is the width and height of the output feature map and 64 is its number of channels.
For the first defect detection model ResNet101, the second convolutional layer 220-1 (conv2_x), the third convolutional layer 230-1 (conv3_x), the fourth convolutional layer 240-1 (conv4_x), and the fifth convolutional layer 250-1 (conv5_x) include 3, 4, 23, and 3 convolution blocks respectively, each block containing two 1×1 convolution units and one 3×3 convolution unit. For the second defect detection model ResNet18, the second convolutional layer 220-2 (conv2_x), the third convolutional layer 230-2 (conv3_x), the fourth convolutional layer 240-2 (conv4_x), and the fifth convolutional layer 250-2 (conv5_x) include 2, 2, 2, and 2 convolution blocks respectively, each block containing two 3×3 convolution units. Processed layer by layer, the second convolutional layers 220-1 and 220-2 (conv2_x) output a 56×56×256 feature map, the third convolutional layers 230-1 and 230-2 (conv3_x) output a 28×28×512 feature map, the fourth convolutional layers 240-1 and 240-2 (conv4_x) output a 14×14×1024 feature map, and the fifth convolutional layers 250-1 and 250-2 (conv5_x) output a 7×7×2048 feature map.
After the five convolutional layers, ResNet101 and ResNet18 are further processed by average pooling layers 260-1 and 260-2, fully connected layers 270-1 and 270-2, and softmax layers 280-1 and 280-2 respectively, and output a predicted classification result for the sample image data in the form of a defect classification probability vector.
In this step, each sample image in the sample image data set is first input into the pre-trained first defect detection model and the randomly initialized second defect detection model, and the first feature map output by the target convolutional layer in the first defect detection model and the second feature map output by the corresponding target convolutional layer in the second defect detection model are then extracted respectively. In one implementation, the last convolutional layer of each of the first and second defect detection models may be selected as the target convolutional layer, and the feature maps they output may be extracted.
Assuming any sample image in the sample image data set is denoted $I_s$, the first feature map output by the target convolutional layer of the first defect detection model is $M_1(I_s)$ and the second feature map output by the target convolutional layer of the second defect detection model is $M_2(I_s)$; $M_1(I_s)$ and $M_2(I_s)$ are of size $W\times H\times C$, where $W$ is the width of the feature map, $H$ its height, and $C$ its number of channels.
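As an illustrative sketch, the two feature maps might be captured with forward hooks on the torchvision implementations; the hook names, the dummy input, and the 1×1 projection are assumptions. In torchvision's ResNet the conv5_x stage is exposed as `layer4`, and standard ResNet101 and ResNet18 actually differ in channel count there (2048 versus 512), which the assumed projection compensates for:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, resnet101

teacher = resnet101(weights=None).eval()  # pre-trained weights would be loaded here
student = resnet18(weights=None)

feats = {}

def save_to(name):
    def hook(module, inputs, output):
        feats[name] = output
    return hook

# conv5_x corresponds to `layer4` in torchvision's ResNet implementation.
teacher.layer4.register_forward_hook(save_to("M1"))
student.layer4.register_forward_hook(save_to("M2"))

x = torch.randn(1, 3, 224, 224)  # a stand-in sample image I_s
with torch.no_grad():
    teacher(x)
student(x)

# Standard ResNet101/ResNet18 emit 2048 vs 512 channels at layer4, so a 1x1
# projection of the student features is assumed here to make the per-position
# feature vectors comparable; the application does not describe this step.
project = nn.Conv2d(512, 2048, kernel_size=1)
m1, m2 = feats["M1"], project(feats["M2"])  # first and second feature maps
```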
Step S130: calculate the distances between corresponding feature vectors of the first feature map and the second feature map, correct the distances with the corresponding elements of the segmentation labeling factor matrix to obtain the corrected distances between the corresponding feature vectors of the first feature map and the second feature map, and calculate the sum of the corrected distances between all feature vectors of the first feature map and the second feature map as the first loss function.
In this step, since each feature point of a feature map output by a convolutional layer of a deep convolutional neural network yields a corresponding feature vector whose dimension equals the number of channels of the feature map, for each feature point position $(m, n)$ in the first and second feature maps, a first feature vector $M_1(I_s)_{m,n}$ can be extracted from the first feature map and a second feature vector $M_2(I_s)_{m,n}$ from the second feature map; the first and second feature vectors form a corresponding feature vector pair. The distance between the first feature vector and the second feature vector can then be calculated.
In one implementation, the distance between the first feature vector and the second feature vector may be the squared Euclidean distance between their respective normalized vectors. Specifically, the normalized vectors of the first and second feature vectors are:
$$\hat{M}_1(I_s)_{m,n}=\frac{M_1(I_s)_{m,n}}{\|M_1(I_s)_{m,n}\|_2},\qquad \hat{M}_2(I_s)_{m,n}=\frac{M_2(I_s)_{m,n}}{\|M_2(I_s)_{m,n}\|_2}$$
where $\|M_1(I_s)_{m,n}\|_2$ and $\|M_2(I_s)_{m,n}\|_2$ denote the L2 norms of the first and second feature vectors respectively.
The squared Euclidean distance $E_{m,n}$ between the two normalized vectors can then be calculated as:
$$E_{m,n}=\sum_{p=1}^{C}\left(\hat{M}_1(I_s)_{m,n}^{(p)}-\hat{M}_2(I_s)_{m,n}^{(p)}\right)^2$$
where $\hat{M}_1(I_s)_{m,n}^{(p)}$ and $\hat{M}_2(I_s)_{m,n}^{(p)}$ denote the $p$-th elements of the respective normalized vectors.
Then, after the squared Euclidean distance between the normalized first and second feature vectors has been obtained, the product of this squared Euclidean distance and the corresponding element of the segmentation labeling factor matrix is calculated to obtain the corrected distance between the first and second feature vectors.
In one implementation, because the size of the segmentation labeling factor matrix equals the pixel size of the sample image and differs from the width and height of the first and second feature maps, the embodiments of the present application may apply a size transformation to the segmentation labeling factor matrix to align it to the width and height $W\times H$ of the first and second feature maps; this can be implemented by a resize() scaling operation on the matrix using nearest-neighbor or bilinear interpolation. Taking nearest-neighbor interpolation as an example, the element positions of the segmentation labeling factor matrix are scaled down proportionally onto the target element positions of the transformed matrix, whose size thus becomes $W\times H$.
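A minimal sketch of this alignment step, assuming PyTorch's `F.interpolate` as the resize() operation; the random stand-in matrix and the 7×7 target size are assumptions:

```python
import torch
import torch.nn.functional as F

# Stand-in factor matrix: +1 on the non-defect region, -1 on a sparse defect region.
A = torch.where(torch.rand(224, 224) > 0.99, torch.tensor(-1.0), torch.tensor(1.0))
# Nearest-neighbour resize of A to the W x H spatial size of the feature maps.
A_r = F.interpolate(A[None, None], size=(7, 7), mode="nearest")[0, 0]
```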
Accordingly, calculating the product of the squared Euclidean distance and the corresponding element of the segmentation labeling factor matrix may include calculating the product of the squared Euclidean distance and the corresponding element of the transformed segmentation labeling factor matrix, thereby obtaining the corrected distance between the first and second feature vectors. Specifically, if the transformed segmentation labeling factor matrix is denoted $A_r$, then for the first and second feature vectors at each feature point position $(m, n)$ the corresponding element $A_r(m,n)$ can be found in $A_r$; this element is the correction factor for the distance between the two vectors. The corrected distance $\tilde{E}_{m,n}$ between the first and second feature vectors can therefore be expressed as:
$$\tilde{E}_{m,n}=A_r(m,n)\cdot E_{m,n}$$
The sum of the corrected distances between the feature vectors at all feature point positions of the first and second feature maps is then taken as the first loss function $\mathrm{Loss}_1(I_s)$, namely:
$$\mathrm{Loss}_1(I_s)=\sum_{m=1}^{W}\sum_{n=1}^{H}\tilde{E}_{m,n}$$
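Putting the pieces together, a sketch of $\mathrm{Loss}_1$ for one sample, under the assumption that both feature maps are C×H×W tensors with matching channel counts; the function name is illustrative:

```python
import torch
import torch.nn.functional as F

def first_loss(m1: torch.Tensor, m2: torch.Tensor, A_r: torch.Tensor) -> torch.Tensor:
    """Loss_1 for one sample: m1, m2 are C x H x W feature maps, A_r is H x W."""
    n1 = F.normalize(m1, dim=0)          # unit feature vector at each position (m, n)
    n2 = F.normalize(m2, dim=0)
    E = ((n1 - n2) ** 2).sum(dim=0)      # squared Euclidean distances, H x W
    return (A_r * E).sum()               # corrected distances summed over all positions
```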
Step S140: iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
In this step, on the basis of the first loss function obtained above, the second defect detection model can be iteratively trained by minimizing the first loss function, with its parameters updated iteratively under given learning rate and batch size conditions, finally yielding a suitable distillation-compressed second defect detection model; the compressed second defect detection model can then be deployed to the target mobile device to perform defect classification and detection on product appearance images.
In this embodiment, the first loss function corrects the distances between the feature vectors of the first and second feature maps at all feature point positions with the segmentation labeling factors of the segmentation labeling factor matrix, and the matrix assigns positive factor values to pixels of the non-defect region of the sample image and opposite factor values to pixels of the defect region. Consequently, when the second defect detection model is trained by distillation learning based on minimizing the first loss function, on the one hand the distance between the non-defect image features extracted by the first and second defect detection models is reduced, so that the non-defect image features extracted by the distillation-compressed second defect detection model are as similar as possible to those of the first defect detection model; on the other hand, the distance between the defect image features extracted by the first and second defect detection models is simultaneously increased, so that the defect image features extracted by the distillation-compressed second defect detection model differ substantially from those of the first defect detection model, and the features the second defect detection model extracts from defect images become clearly distinguishable from those it extracts from non-defect images. This improves the feature perception of the distillation-compressed second defect detection model for defect images containing tiny defects and the accuracy of the compressed second defect detection model in detecting tiny defects in product appearance.
In one implementation, if the size of the first feature map extracted by the first defect detection model is inconsistent with the size of the second feature map extracted by the second defect detection model, typically with the first feature map being larger than the second, the first feature map needs to be downsampled or the second feature map upsampled to align their sizes before steps S130 and S140 above are performed.
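A one-line sketch of that alignment, assuming N×C×H×W tensors and bilinear upsampling of the student map; the shapes are placeholders:

```python
import torch
import torch.nn.functional as F

m1 = torch.randn(1, 256, 14, 14)  # first feature map (teacher), larger
m2 = torch.randn(1, 256, 7, 7)    # second feature map (student), smaller
# Upsample the second feature map (downsampling the first would work equally well).
m2 = F.interpolate(m2, size=m1.shape[-2:], mode="bilinear", align_corners=False)
```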
Figure 3 is a flowchart of a compression training method for a defect detection model according to another embodiment of the present application. As shown in Figure 3, on the basis of any of the preceding embodiments, steps S120 and S130 can be further optimized and improved to obtain the following steps:
Step S320: input each sample image in the sample image data set into the first defect detection model and the second defect detection model respectively, and respectively extract multiple first feature maps output by multiple target convolutional layers in the first defect detection model and multiple second feature maps output by the corresponding multiple target convolutional layers in the second defect detection model, where the second defect detection model is a deep convolutional neural network model of the same architecture as the pre-trained first defect detection model but more lightweight.
In this embodiment, the pre-trained first defect detection model and the randomly initialized second defect detection model may be implemented as in the preceding embodiments, which is not repeated here.
In this step, each sample image is input into the pre-trained first defect detection model and the randomly initialized second defect detection model, and multiple target convolutional layers are selected from each of the two models. In one implementation, the multiple target convolutional layers may be several consecutive convolutional layers chosen from the convolutional layers of each of the first and second defect detection models. Continuing with the network structures of the first defect detection model ResNet101 and the second defect detection model ResNet18 shown in Figure 2, as an example, the first convolutional layer 210-1 (conv1) and the second convolutional layer 220-1 (conv2_x) may be selected from the first defect detection model and the first convolutional layer 210-2 (conv1) and the second convolutional layer 220-2 (conv2_x) from the second defect detection model as the corresponding target convolutional layers; or the fourth convolutional layer 240-1 (conv4_x) and the fifth convolutional layer 250-1 (conv5_x) may be selected from the first defect detection model and the fourth convolutional layer 240-2 (conv4_x) and the fifth convolutional layer 250-2 (conv5_x) from the second defect detection model, and so on. In this way, multiple corresponding first and second feature maps can be extracted from the multiple target convolutional layers of the first and second defect detection models respectively.
Step S330: sequentially calculate, for each first feature map and the corresponding second feature map among the multiple first and second feature maps, the distances between their corresponding feature vectors; correct the distances with the corresponding elements of the segmentation labeling factor matrix to obtain the corrected distances between the corresponding feature vectors of each first feature map and the corresponding second feature map; calculate the sum of the corrected distances between all feature vectors of each first feature map and the corresponding second feature map; and calculate the accumulation of these sums over the multiple first and second feature maps as the first loss function.
Specifically, assume that $L$ target convolutional layers are selected from each of the first and second defect detection models and that the $L$ first feature maps and the corresponding $L$ second feature maps output by those layers are extracted, $L$ being an integer greater than 1. Then, for the $l$-th first feature map and the corresponding second feature map, $0<l\le L$, the squared Euclidean distance between the normalized first and second feature vectors at feature point position $(m, n)$ is $E^{(l)}_{m,n}$, and the corrected distance between the first and second feature vectors is denoted $\tilde{E}^{(l)}_{m,n}$; that is, the corrected distance is characterized by the product of the squared Euclidean distance and the corresponding element of the size-transformed segmentation labeling factor matrix for the $l$-th pair of feature maps, calculated as:
$$\tilde{E}^{(l)}_{m,n}=A_{r,l}(m,n)\cdot E^{(l)}_{m,n}$$
where $A_{r,l}(m,n)$ is the element, at feature point position $(m, n)$, of the size-transformed segmentation labeling factor matrix corresponding to the $l$-th first feature map and the corresponding second feature map. Moreover, because the feature maps output by the multiple target convolutional layers differ in size, the segmentation labeling factor matrix needs a separate size transformation for each first feature map, aligning its size to the width and height of each pair of first and second feature maps.
Then the accumulation of the sums of corrected distances of each first feature map and the corresponding second feature map over the multiple first and second feature maps is taken as the first loss function, which can be calculated as:
$$\mathrm{Loss}_1(I_s)=\sum_{l=1}^{L}\sum_{m=1}^{W_l}\sum_{n=1}^{H_l}\tilde{E}^{(l)}_{m,n}$$
where $W_l$ and $H_l$ denote the width and height of the $l$-th first feature map and the corresponding second feature map respectively.
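Reusing the per-pair first_loss sketch above, the multi-layer accumulation might look like this; the list arguments and the helper name are assumptions:

```python
def multi_layer_first_loss(teacher_maps, student_maps, factor_mats):
    """Accumulate Loss_1 over L pairs of target-layer feature maps.

    teacher_maps, student_maps: lists of C_l x H_l x W_l tensors from the L
    target convolutional layers; factor_mats: list of H_l x W_l matrices, the
    segmentation labeling factor matrix resized once per feature-map size.
    """
    return sum(first_loss(m1, m2, A_r)
               for m1, m2, A_r in zip(teacher_maps, student_maps, factor_mats))
```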
本实施例通过对第一缺陷检测模型和第二缺陷检测模型中多个目标卷积层提取的多个第一特征图和对应的多个第二特征图之间的修正距离进行累加,可以综合考虑第一缺陷检测模型和第二缺陷检测模型中多个中间卷积层的特征提取特性,可以更加有利于第一缺陷 检测模型和第二缺陷检测模型之间的蒸馏学习,使得基于最小化所述第一损失函数对第二缺陷检测模型进行蒸馏学习训练时,在减少了第一缺陷检测模型和第二缺陷检测模型提取的非缺陷图像特征之间的距离的同时,增加了第一缺陷检测模型和第二缺陷检测模型提取的缺陷图像特征之间的距离,使第二缺陷检测模型对缺陷图像提取的缺陷图像特征与对非缺陷图像提取的特征具有显著的区分度,从而进一步提高蒸馏压缩得到的第二缺陷检测模型对包含微小缺陷的缺陷图像的特征感知能力,进一步提升压缩后的第二缺陷检测模型对产品外观微小缺陷的检测准确率。
In some embodiments, as shown in Fig. 4, the method of the embodiment of the present application may further include:

Step S410: after each sample image is input into the second defect detection model, obtain the defect classification probability vector output by the second defect detection model;

Step S420: calculate the cross-entropy loss between the defect classification probability vector and the classification annotation vector of the sample image data as a second loss function;

Step S430: calculate the weighted sum of the first loss function and the second loss function as a total loss function, and iteratively train the second defect detection model based on minimizing the total loss function to obtain the distillation-compressed second defect detection model.
In this embodiment, while the second defect detection model is trained by distillation learning, the defect classification probability vector output by the second defect detection model is obtained at the same time. The defect classification probability vector may be a probability vector $[c_1, c_2, \ldots, c_K]$ output through the softmax layer 280-2 shown in Fig. 2, where K is the number of classes consisting of the non-defect image class and multiple defect image classes, and the probability vector represents the predicted classification probabilities of each sample image. The cross-entropy loss between the defect classification probability vector of each sample image and the classification annotation vector (classification ground truth) of the sample image data is taken as the second loss function $Loss_2(I_s)$. Then, the weighted sum of the first loss function and the second loss function is taken as the total loss function, i.e., $Loss_{total}(I_s)=Loss_1(I_s)+\alpha\,Loss_2(I_s)$, where α is the weighting coefficient of the first and second loss functions and can be adjusted according to empirical values during training. Subsequently, the second defect detection model can be iteratively trained based on minimizing the weighted sum of the first loss function and the second loss function, updating the parameters of the second defect detection model to obtain the distillation-compressed second defect detection model.
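As a sketch (with α = 0.5 purely as a placeholder, since the patent only says the coefficient is tuned empirically), and noting that PyTorch's F.cross_entropy takes pre-softmax logits rather than the softmax probabilities described above:

```python
import torch.nn.functional as F

def total_loss(loss1, logits, labels, alpha=0.5):
    """Weighted sum of the distillation loss (Loss1) and the student's own
    cross-entropy classification loss (Loss2)."""
    loss2 = F.cross_entropy(logits, labels)    # second loss function
    return loss1 + alpha * loss2               # Loss_total = Loss1 + alpha * Loss2
```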
In the distillation learning training of the second defect detection model, this embodiment further takes into account, on top of the aforementioned first loss function, the prediction loss of the second defect detection model itself, which can auxiliarily improve the prediction accuracy of the distillation-trained second defect detection model for tiny defects in product appearance.
In one embodiment, the method further includes:

For the multiple sample images of each batch in the sample image data set, calculating the average of the total loss functions obtained by inputting each sample image into the first defect detection model and the second defect detection model, and iteratively training the second defect detection model based on minimizing the average of the total loss functions.
Assuming the batch size of model training is N, then for the multiple sample images $\{I_1, I_2, \ldots, I_N\}$ of each batch, which are sequentially input into the first defect detection model and the second defect detection model for training, the average of the total loss function over each batch can be calculated as:

$$Loss_{avg}=\frac{1}{N}\sum_{i=1}^{N}Loss_{total}(I_i)$$

In this way, the second defect detection model can be iteratively trained based on minimizing the per-batch average $Loss_{avg}$ of the total loss function, updating the parameters of the second defect detection model to obtain the distillation-compressed second defect detection model.
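Putting the batch average together with the earlier sketches (teacher, student, optimizer, target_layers, and loader are all assumed names, and for clarity each image is processed separately even though a real implementation would batch the forward passes), one mini-batch step might look like:

```python
import torch

for images, labels, A_batch in loader:             # one batch of N sample images
    batch_loss = 0.0
    for img, label, A in zip(images, labels, A_batch):
        x = img.unsqueeze(0)                       # add a batch dimension of 1
        with torch.no_grad():
            _, feats_t = collect_feature_maps(teacher, target_layers, x)
        logits, feats_s = collect_feature_maps(student, target_layers, x)
        l1 = loss1_multi_layer(feats_t, feats_s, A)
        batch_loss = batch_loss + total_loss(l1, logits, label.unsqueeze(0))
    loss_avg = batch_loss / len(images)            # Loss_avg over the batch
    optimizer.zero_grad()
    loss_avg.backward()
    optimizer.step()
```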
Fig. 5 is a schematic structural diagram of a compression training device for a defect detection model according to an embodiment of the present application. As shown in Fig. 5, the compression training device for a defect detection model of the embodiment of the present application includes the following module units:

A segmentation annotation unit 510, configured to perform segmentation annotation of defective regions on a sample image data set of product appearance to obtain a segmentation annotation factor matrix for each sample image.

A feature extraction unit 520, configured to input each sample image in the sample image data set into the first defect detection model and the second defect detection model respectively, and respectively extract the first feature map output by a target convolutional layer in the first defect detection model and the second feature map output by the corresponding target convolutional layer in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as, but more lightweight than, the pre-trained first defect detection model.

A first loss evaluation unit 530, configured to calculate the distances between the corresponding feature vectors of the first feature map and the second feature map, correct the distances with the corresponding elements of the segmentation annotation factor matrix to obtain the corrected distances between the corresponding feature vectors of the first feature map and the second feature map, and calculate the sum of the corrected distances over all feature vectors of the first feature map and the second feature map as the first loss function.

A first iterative training unit 540, configured to iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
Fig. 6 is a schematic structural diagram of a compression training device for a defect detection model according to another embodiment of the present application. As shown in Fig. 6, the compression training device for a defect detection model of the embodiment of the present application includes the following module units:

A segmentation annotation unit 610, configured to perform segmentation annotation of defective regions on a sample image data set of product appearance to obtain a segmentation annotation factor matrix for each sample image.

A feature extraction unit 620, configured to input each sample image in the sample image data set into the first defect detection model and the second defect detection model respectively, and respectively extract multiple first feature maps output by multiple target convolutional layers in the first defect detection model and multiple second feature maps output by the corresponding multiple target convolutional layers in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as, but more lightweight than, the pre-trained first defect detection model.

A first loss evaluation unit 630, configured to sequentially calculate, for each first feature map and its corresponding second feature map among the multiple first feature maps and second feature maps, the distances between corresponding feature vectors, correct the distances with the corresponding elements of the segmentation annotation factor matrix to obtain the corrected distances between the corresponding feature vectors of each first feature map and its corresponding second feature map, calculate the sum of the corrected distances over all feature vectors of each first feature map and its corresponding second feature map, and calculate the accumulation of these sums over the multiple first feature maps and second feature maps as the first loss function.

A first iterative training unit 640, configured to iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
In one embodiment, as shown in Fig. 7, an embodiment of the present application may further include:

A probability vector acquisition unit 710, configured to obtain, after each sample image is input into the second defect detection model, the defect classification probability vector output by the second defect detection model.

A second loss evaluation unit 720, configured to calculate the cross-entropy loss between the defect classification probability vector and the classification annotation vector of the sample image as a second loss function.

A second iterative training unit 730, configured to calculate the weighted sum of the first loss function and the second loss function as a total loss function, and to iteratively train the second defect detection model based on minimizing the total loss function to obtain the distillation-compressed second defect detection model.
In one embodiment, the device further includes:

A third iterative training unit, configured to calculate, for the multiple sample images of each batch in the sample image data set, the average of the total loss functions obtained by inputting each sample image into the first defect detection model and the second defect detection model, and to iteratively train the second defect detection model based on minimizing the average of the total loss functions.

It should be noted that those skilled in the art will appreciate that the different implementations described in the method embodiments of the present application, together with their explanations and the technical effects they achieve, are equally applicable to the device embodiments of the present application and will not be repeated here.
By adding segmentation annotation factors for image defect regions to the knowledge-distillation compression training process of a deep learning defect detection model, the embodiments of the present application improve the distillation-compressed deep learning defect detection model's ability to perceive the features of defect images containing tiny defects and raise the compressed deep learning defect detection model's detection accuracy for tiny defects in product appearance.

The present application may be implemented in software, hardware, or a combination of software and hardware. When implemented as a computer software program, the computer software program can be installed in the memory of a computing device and executed by one or more processors to implement the corresponding functions.

Further, embodiments of the present application may also include a computer-readable medium storing program instructions; in such embodiments, when the computer-readable storage medium is loaded in a computing device, the program instructions can be executed by one or more processors to perform the method steps described in any embodiment of the present application.

Further, embodiments of the present application may also include a computer program product comprising a computer-readable medium carrying program instructions; in such embodiments, the program instructions can be executed by one or more processors to perform the method steps described in any embodiment of the present application.

Exemplary embodiments of the present application have been described above. It should be understood that the above exemplary embodiments are illustrative rather than restrictive, and the scope of protection of the present application is not limited thereto. It should be understood that those skilled in the art may make modifications and variations to the embodiments of the present application without departing from the spirit and scope of the present application, and such modifications and variations shall fall within the scope of protection of the present application.

Claims (10)

  1. A compression training method for a defect detection model, characterized by comprising:
    performing segmentation annotation of defective regions on a sample image data set of product appearance to obtain a segmentation annotation factor matrix for each sample image;
    inputting each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extracting a first feature map output by a target convolutional layer in the first defect detection model and a second feature map output by the corresponding target convolutional layer in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as, but more lightweight than, the pre-trained first defect detection model;
    calculating distances between corresponding feature vectors of the first feature map and the second feature map, correcting the distances with corresponding elements of the segmentation annotation factor matrix to obtain corrected distances between the corresponding feature vectors of the first feature map and the second feature map, and calculating the sum of the corrected distances over all feature vectors of the first feature map and the second feature map as a first loss function;
    iteratively training the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
  2. The compression training method for a defect detection model according to claim 1, characterized in that the segmentation annotation factor matrix is used to annotate the factor value corresponding to each pixel of each sample image, wherein the factor values of the pixels of the defective regions of each sample image and the factor values of the pixels of the non-defective regions of each sample image are opposite numbers of each other.
  3. The compression training method for a defect detection model according to claim 2, characterized in that said calculating the distances between the corresponding feature vectors of the first feature map and the second feature map and correcting the distances with the corresponding elements of the segmentation annotation factor matrix to obtain the corrected distances between the corresponding feature vectors of the first feature map and the second feature map comprises:
    calculating the squared Euclidean distance between the respective normalized vectors of the corresponding feature vectors of the first feature map and the second feature map;
    calculating the product of the squared Euclidean distance and the corresponding element of the segmentation annotation factor matrix to obtain the corrected distance between the corresponding feature vectors of the first feature map and the second feature map.
  4. The compression training method for a defect detection model according to claim 3, characterized in that said calculating the product of the squared Euclidean distance and the corresponding element of the segmentation annotation factor matrix to obtain the corrected distance between the corresponding feature vectors of the first feature map and the second feature map comprises:
    performing a size transformation operation on the segmentation annotation factor matrix to obtain a transformed segmentation annotation factor matrix aligned to the size of the first feature map and the second feature map;
    calculating the product of the squared Euclidean distance and the corresponding element of the transformed segmentation annotation factor matrix, thereby obtaining the corrected distance between the corresponding feature vectors of the first feature map and the second feature map.
  5. The compression training method for a defect detection model according to claim 4, characterized in that the method further comprises:
    after each sample image is input into the second defect detection model, obtaining a defect classification probability vector output by the second defect detection model;
    calculating a cross-entropy loss between the defect classification probability vector and a classification annotation vector of the sample image as a second loss function;
    calculating a weighted sum of the first loss function and the second loss function as a total loss function, and iteratively training the second defect detection model based on minimizing the total loss function to obtain the distillation-compressed second defect detection model.
  6. The compression training method for a defect detection model according to claim 5, characterized in that the method further comprises: for the multiple sample images of each batch in the sample image data set, calculating the average of the total loss functions obtained by inputting each sample image into the first defect detection model and the second defect detection model, and iteratively training the second defect detection model based on minimizing the average of the total loss functions.
  7. The compression training method for a defect detection model according to claim 6, characterized in that the method further comprises: if the sizes of the first feature map and the second feature map are inconsistent, downsampling the first feature map or upsampling the second feature map to align the sizes of the first feature map and the second feature map.
  8. A compression training method for a defect detection model, characterized by comprising:
    performing segmentation annotation of defective regions on a sample image data set of product appearance to obtain a segmentation annotation factor matrix for each sample image;
    inputting each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extracting multiple first feature maps output by multiple target convolutional layers in the first defect detection model and multiple second feature maps output by the corresponding multiple target convolutional layers in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as, but more lightweight than, the pre-trained first defect detection model;
    sequentially calculating, for each first feature map and its corresponding second feature map among the multiple first feature maps and second feature maps, the distances between corresponding feature vectors, correcting the distances with the corresponding elements of the segmentation annotation factor matrix to obtain the corrected distances between the corresponding feature vectors of each first feature map and its corresponding second feature map, calculating the sum of the corrected distances over all feature vectors of each first feature map and its corresponding second feature map, and calculating the accumulation of these sums over the multiple first feature maps and second feature maps as a first loss function;
    iteratively training the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
  9. A compression training device for a defect detection model, characterized by comprising:
    a segmentation annotation unit, configured to perform segmentation annotation of defective regions on a sample image data set of product appearance to obtain a segmentation annotation factor matrix for each sample image;
    a feature extraction unit, configured to input each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extract a first feature map output by a target convolutional layer in the first defect detection model and a second feature map output by the corresponding target convolutional layer in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as, but more lightweight than, the pre-trained first defect detection model;
    a first loss evaluation unit, configured to calculate the distances between the corresponding feature vectors of the first feature map and the second feature map, correct the distances with the corresponding elements of the segmentation annotation factor matrix to obtain the corrected distances between the corresponding feature vectors of the first feature map and the second feature map, and calculate the sum of the corrected distances over all feature vectors of the first feature map and the second feature map as a first loss function;
    a first iterative training unit, configured to iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
  10. A compression training device for a defect detection model, characterized by comprising:
    a segmentation annotation unit, configured to perform segmentation annotation of defective regions on a sample image data set of product appearance to obtain a segmentation annotation factor matrix for each sample image;
    a feature extraction unit, configured to input each sample image in the sample image data set into a first defect detection model and a second defect detection model respectively, and respectively extract multiple first feature maps output by multiple target convolutional layers in the first defect detection model and multiple second feature maps output by the corresponding multiple target convolutional layers in the second defect detection model, wherein the second defect detection model is a deep convolutional neural network model of the same architecture as, but more lightweight than, the pre-trained first defect detection model;
    a first loss evaluation unit, configured to sequentially calculate, for each first feature map and its corresponding second feature map among the multiple first feature maps and second feature maps, the distances between corresponding feature vectors, correct the distances with the corresponding elements of the segmentation annotation factor matrix to obtain the corrected distances between the corresponding feature vectors of each first feature map and its corresponding second feature map, calculate the sum of the corrected distances over all feature vectors of each first feature map and its corresponding second feature map, and calculate the accumulation of these sums over the multiple first feature maps and second feature maps as a first loss function;
    a first iterative training unit, configured to iteratively train the second defect detection model based on minimizing the first loss function to obtain the distillation-compressed second defect detection model.
PCT/CN2023/116994 2022-09-05 2023-09-05 Compression training method and device for defect detection model WO2024051686A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211075557.2A CN115147418B (zh) 2022-09-05 2022-09-05 Compression training method and device for defect detection model
CN202211075557.2 2022-09-05

Publications (1)

Publication Number Publication Date
WO2024051686A1 true WO2024051686A1 (zh) 2024-03-14

Family

ID=83415533

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/116994 WO2024051686A1 (zh) Compression training method and device for defect detection model

Country Status (2)

Country Link
CN (1) CN115147418B (zh)
WO (1) WO2024051686A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115147418B (zh) * 2022-09-05 2022-12-27 东声(苏州)智能科技有限公司 Compression training method and device for defect detection model
CN116503694B (zh) * 2023-06-28 2023-12-08 宁德时代新能源科技股份有限公司 Model training method, image segmentation method, device, and computer equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191489A (zh) * 2021-04-30 2021-07-30 华为技术有限公司 Training method for binary neural network model, image processing method and device
CN113408571A (zh) * 2021-05-08 2021-09-17 浙江智慧视频安防创新中心有限公司 Image classification method and device based on model distillation, storage medium and terminal
CN114565045A (zh) * 2022-03-01 2022-05-31 北京航空航天大学 Knowledge distillation method for remote sensing object detection based on feature-separated attention
CN114708270A (zh) * 2021-12-15 2022-07-05 华东师范大学 Semantic segmentation model compression system and compression method based on knowledge aggregation and decoupled distillation
CN114842449A (zh) * 2022-05-10 2022-08-02 安徽蔚来智驾科技有限公司 Object detection method, electronic device, medium, and vehicle
CN115147418A (zh) * 2022-09-05 2022-10-04 东声(苏州)智能科技有限公司 Compression training method and device for defect detection model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102166458B1 (ko) * 2018-12-28 2020-10-15 이화여자대학교 산학협력단 Defect detection method and defect detection device using artificial-neural-network-based image segmentation
CN111768388B (zh) * 2020-07-01 2023-08-11 哈尔滨工业大学(深圳) Product surface defect detection method and system based on positive sample reference
CN112381763A (zh) * 2020-10-23 2021-02-19 西安科锐盛创新科技有限公司 Surface defect detection method
CN114049332A (zh) * 2021-11-16 2022-02-15 上海商汤智能科技有限公司 Anomaly detection method and device, electronic device, and storage medium
CN114299034A (zh) * 2021-12-30 2022-04-08 杭州海康威视数字技术股份有限公司 Training method for defect detection model, defect detection method and device

Also Published As

Publication number Publication date
CN115147418A (zh) 2022-10-04
CN115147418B (zh) 2022-12-27

Similar Documents

Publication Publication Date Title
WO2024051686A1 (zh) Compression training method and device for defect detection model
CN109683360B (zh) Liquid crystal panel defect detection method and device
CN109118473B (zh) Corner detection method based on neural network, storage medium and image processing system
CN111709909A (zh) General printing defect detection method based on deep learning and model thereof
CN111461113B (zh) Large-angle license plate detection method based on a deformed planar object detection network
CN112818969A (zh) Face pose estimation method and system based on knowledge distillation
US11348349B2 (en) Training data increment method, electronic apparatus and computer-readable medium
CN112258470B (zh) Intelligent analysis system and method for critical compression rate of industrial images based on defect detection
CN115861190A (zh) Unsupervised defect detection method for photovoltaic modules based on contrastive learning
CN114255212A (zh) FPC surface defect detection method and system based on CNN
CN113780423A (zh) Single-stage object detection neural network based on multi-scale fusion and surface defect detection model for industrial products
CN114972316A (zh) Real-time detection method for battery case end-face defects based on improved YOLOv5
CN115240259A (zh) Face detection method and detection system for classroom environments based on a YOLO deep network
CN115631411A (zh) Insulator damage detection method for different environments based on the STEN network
CN115410059A (zh) Partially supervised change detection method and device for remote sensing images based on contrastive loss
CN110991563A (зh) Capsule network random routing algorithm based on feature fusion
CN111539931A (зh) Appearance anomaly detection method based on a convolutional neural network and boundary-constrained optimization
CN116363105A (зh) Method for identifying and locating high-speed railway catenary components based on Faster R-CNN
CN115511820A (зh) Flexible circuit board defect detection model training method and defect detection method
CN115100068A (зh) Infrared image correction method
CN114581722A (зh) Two-stage multi-class industrial image defect detection method based on a Siamese residual network
CN113487571A (зh) Self-supervised anomaly detection method based on image quality assessment
Xie et al. Defect detection of printed circuit board based on small target recognition network
CN111524119A (зh) Two-dimensional code defect detection method based on deep learning
CN110826564A (зh) Small-target semantic segmentation method and system for complex-scene images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23862380

Country of ref document: EP

Kind code of ref document: A1