CN114417993A - Scratch detection method based on deep convolutional neural network and image segmentation - Google Patents

Scratch detection method based on deep convolutional neural network and image segmentation

Info

Publication number
CN114417993A
Authority
CN
China
Prior art keywords
scratch
feature
prediction
layer
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210055893.4A
Other languages
Chinese (zh)
Inventor
周富强
杨乐淼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202210055893.4A priority Critical patent/CN114417993A/en
Publication of CN114417993A publication Critical patent/CN114417993A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a scratch detection method based on a deep convolutional neural network and image segmentation. A multi-feature fusion module is designed according to the morphological characteristics of scratches and added to a deep convolutional neural network structure, which identifies and locates scratches and extracts a scratch prediction-box image. Principal component points of the prediction-box image are then computed by principal component analysis and used as growth seed points for 8-neighborhood region growing on the adaptively mean-thresholded prediction-box image, achieving accurate segmentation of scratch pixels. The invention identifies, locates and segments scratches with high precision and facilitates subsequent precise measurement of scratch dimensions.

Description

Scratch detection method based on deep convolutional neural network and image segmentation
Technical Field
The invention belongs to the field of automatic defect detection and precision measurement, and particularly relates to a scratch detection method based on a deep convolutional neural network and image segmentation.
Background
The rising level of modern industrial intelligence places ever higher demands on product surface quality. Surface quality problems usually appear in local areas of the product surface with uneven physical or chemical properties, such as scratches on key automobile and train components, laser-welding seams, concrete cracks and chip-packaging scratches. By their morphological characteristics these defects can all be classed as scratches, and effective detection of surface scratches is important for guaranteeing product quality and performance, improving yield and ensuring safe use. Machine vision is a non-contact, non-destructive automatic detection technology with outstanding advantages: it is safe and reliable, has a wide spectral response range, and can work for long periods in harsh environments. Applying machine vision to automatic scratch detection is a current research direction in this field.
However, scratch images acquired in industrial settings often contain heavy noise, and lighting conditions are complex and varied, causing large grayscale variation in the image and low contrast between scratch and background. For scratch detection tasks involving complex surface structures and large inspection volumes, this makes it very difficult to guarantee both detection accuracy and speed. Moreover, in precision manufacturing fields such as aerospace, military weapons, semiconductors and microelectronics, automatic scratch detection must not only identify and locate scratches, i.e. decide whether a scratch exists on the inspected surface and determine its position coordinates, but also segment the scratch pixel by pixel and then measure its exact size, which imposes still stricter requirements on the task.
Traditional machine-vision scratch detection methods usually rely on hand-crafted features and require algorithms optimized for the specific inspection requirements and application scene. Such algorithms focus on segmenting and extracting scratches, neglect scratch identification, and are easily disturbed by noise or other defect types. In recent years, emerging deep-learning scratch detection methods have made up for this weakness of the traditional methods in scratch identification; they require training on a large amount of labeled data and focus on identifying and locating scratches, but they can hardly meet the pixel-level segmentation accuracy that precision inspection demands. A detection method that can identify, locate and accurately segment scratches is therefore of practical significance.
Disclosure of Invention
The technical problem solved by the invention is as follows: a multi-feature fusion module designed from scratch morphological characteristics is added to an SSD (Single Shot multibox Detector) network, and a principal-component growing segmentation algorithm is designed for the extracted scratch prediction box, so that scratch pixels are segmented accurately while background noise is effectively suppressed. The method identifies, locates and precisely segments low-contrast, small-size scratches and facilitates subsequent precise measurement of scratch size. While maintaining a high detection speed, its scratch recognition precision exceeds that of current mainstream target recognition methods, and its scratch segmentation results exceed those of current mainstream image segmentation methods on scratch datasets from 2 different application scenarios.
The technical scheme of the invention is as follows: a scratch detection method based on a deep convolutional neural network and image segmentation comprises the following steps:
s1: according to the morphological characteristics of scratches, design a multi-feature fusion module composed of an up-sampling layer, a lateral connection layer and a feature fusion layer; the module up-samples the high-level features to the same scale as the low-level features and then fuses them through the lateral connection layer; the 6 feature layers used for prediction in the Single Shot multibox Detector (SSD) network are respectively input into the multi-feature fusion module, and the feature layers output by the module are used for prediction in step S2;
s2: constructing a loss function of the category and the position, performing regression analysis on the feature layer output by the multi-feature fusion module by using the loss function of the category and the position, identifying the scratch based on the loss function, calculating coordinates of a scratch prediction frame, and outputting an image of the scratch prediction frame;
s3: calculate the scratch direction in the scratch prediction-box image by principal component analysis, and from the scratch direction compute a mean point and a starting point to serve as the region-growing seed points;
s4: perform adaptive threshold segmentation on the scratch prediction-box image, perform 8-neighborhood region growing on the pixels of the adaptively threshold-segmented image, and map the grown point set back to the prediction-box position in the original image to obtain the scratch segmentation result.
The specific operation steps of step S1 include:
s101: the up-sampling layer in the multi-feature fusion module applies a deconvolution operation to the last prediction feature layer of the SSD network and to the 2nd to 5th feature layers in the fusion layer, expanding each to twice its original size;
s102: the lateral connection layer in the multi-feature fusion module applies a convolution to the first 5 prediction feature layers of the SSD network, reducing their channel counts to 1/2 so that they match the channel counts of the up-sampled feature maps and satisfy the fusion condition;
s103: the fusion layer in the multi-feature fusion module performs tensor concatenation on the up-sampled and laterally connected feature maps of equal size, producing 5 prediction feature layers that contain both deep and shallow feature information; the last prediction feature layer of the SSD network is retained as the 6th prediction layer, and the 6 feature layers output by the module are used for prediction in step S2.
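As a rough sketch of the shape bookkeeping in steps s101 to s103, the fusion can be illustrated in NumPy; here a nearest-neighbour repeat stands in for the learned deconvolution, random weights stand in for the learned 1x1 lateral convolution, and all sizes and names are illustrative rather than taken from the patent:

```python
import numpy as np

def upsample2x(f):
    # stand-in for the deconvolution up-sampling: double both spatial dimensions
    return f.repeat(2, axis=1).repeat(2, axis=2)

def lateral_1x1(f, w):
    # 1x1 convolution as a per-pixel linear map that halves the channel count
    c, h, wd = f.shape
    return (w @ f.reshape(c, -1)).reshape(w.shape[0], h, wd)

def fuse(low, high, w):
    # concatenate the channel-reduced low-level map with the up-sampled high-level map
    return np.concatenate([lateral_1x1(low, w), upsample2x(high)], axis=0)

rng = np.random.default_rng(0)
low = rng.standard_normal((8, 16, 16))   # low-level feature map, 8 channels
high = rng.standard_normal((4, 8, 8))    # high-level map, half spatial size
w = rng.standard_normal((4, 8))          # 1x1 weights: 8 -> 4 channels
fused = fuse(low, high, w)
print(fused.shape)  # (8, 16, 16)
```

Once both inputs share spatial size and the channel counts match, concatenation along the channel axis yields a map carrying both deep and shallow information, which is the fusion condition described in s102.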
The specific operation steps of step S2 include:
s201: calculate the class loss function and the position loss function, and extract and screen the prior boxes according to them; the class and position loss functions are computed as:

L_{conf}(x,c) = -\sum_{i \in pos}^{N} x_{ij}^{p} \log(\hat{c}_{i}^{p}) - \sum_{i \in neg} \log(\hat{c}_{i}^{0})    (1)

L_{loc}(x,l,g) = \sum_{i \in pos}^{N} \sum_{v \in \{cx,cy,w,h\}} x_{ij}^{v} \, Smooth_{L1}(l_{i}^{v} - \hat{g}_{j}^{v})    (2)

In the class loss, x_{ij}^{p} indicates whether the i-th prior box matches the ground-truth box of the j-th target with feature p: a value of 1 means matched, 0 unmatched; \hat{c}_{i}^{p} is the confidence that the i-th prior box has feature p, and \hat{c}_{i}^{0} the confidence that it matches no feature; N is the number of matches between ground-truth scratch boxes and prior boxes, pos denotes the positive-sample prior boxes and neg the negative-sample prior boxes. In the position loss, l_{i}^{v} is the deviation of the i-th prior box in coordinate v, \hat{g}_{j}^{v} is the deviation between the j-th ground-truth scratch box and the prior box, Smooth_{L1} is the smooth L1 function, and cx, cy, w, h are the center coordinates, width and height of the prior box;
s202: decode the center coordinates, widths and heights of the screened prior boxes to obtain the scratch prediction box and its coordinate position. Let the widths and heights of the input image and of the labeled scratch box be w_i, h_i, w_o, h_o respectively; the coordinates of the scratch prediction box are then given by equation (3), which is rendered as an image in the original and is not reproduced here;
s203: crop the input image to obtain the scratch prediction-box image.
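The Smooth_L1 term in the position loss above has a standard piecewise definition (quadratic near zero, linear elsewhere); the sketch below, including the illustrative loc_loss helper, is an assumption-laden rendering rather than the patent's own code:

```python
import numpy as np

def smooth_l1(x):
    # quadratic for |x| < 1, linear beyond: robust to large box-offset errors
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x * x, ax - 0.5)

def loc_loss(pred_offsets, gt_offsets):
    # position loss: Smooth_L1 summed over (cx, cy, w, h) of all matched prior boxes
    return float(smooth_l1(pred_offsets - gt_offsets).sum())

# smooth_l1 of 0.5 -> 0.125 (quadratic branch); of 2.0 -> 1.5 (linear branch)
vals = smooth_l1(np.array([0.5, 2.0]))
```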
The specific operation steps of step S3 include:
s301: binarize the scratch prediction-box image and denote the binarized scratch pixels by X = (x^{(1)}, x^{(2)}, ..., x^{(n)}), where each pixel is represented by its 2-dimensional coordinates:

x^{(i)} = (x_{1}^{(i)}, x_{2}^{(i)})^{T}, i = 1, 2, ..., n    (4)

To predict the scratch direction, the 2-dimensional data are mapped into a 1-dimensional space. Let the projection transformation matrix be W; the projection is then:

z^{(i)} = W^{T} x^{(i)}, i = 1, 2, ..., n    (5)

To preserve as much of the original data information as possible, the sum of the variances of the projected data, \sum_{i} W^{T} x^{(i)} x^{(i)T} W, is maximized, i.e. the trace of the projected-data covariance is maximized:

\max_{W} \; tr(W^{T} X X^{T} W), \quad s.t.\; W^{T} W = I    (6)

where I is the identity matrix. Solving with the Lagrange multiplier method, the Lagrangian function is set as:

J(W) = tr(W^{T} X X^{T} W + \lambda (W^{T} W - I))    (7)

where \lambda is the Lagrange multiplier. Differentiating with respect to W and rearranging gives:

X X^{T} W = (-\lambda) W    (8)

Equation (8) shows that W is composed of eigenvectors of the covariance matrix X X^{T}, and -\lambda is the corresponding eigenvalue of X X^{T};
s302: compute the sample covariance matrix X X^{T} and take the eigenvector corresponding to its largest eigenvalue, recording it as the scratch direction;
s303: compute the sample center \bar{x} = (1/n) \sum_{i=1}^{n} x^{(i)} as the mean point, and take the leftmost binarized scratch pixel along the scratch direction as the starting point; the mean point and the starting point serve as the region-growing seed points.
The specific operation steps of step S4 include:
s401: apply adaptive threshold segmentation to the scratch prediction-box image, using the mean gray value of a neighborhood window centered on each pixel as its segmentation threshold; this reduces the influence of uneven illumination and noise and yields the target scratch pixel set A;
s402: grow 8-neighborhood regions from the points in the seed set, checking whether each 8-neighborhood pixel P of a seed point satisfies the growth criterion R_g(P):

R_g(P) = 1 if P \in A, and R_g(P) = 0 otherwise    (9)

If R_g(P) = 1, push P onto the seed-point stack and record it in the growth point set S; apply the growth criterion to every pixel on the stack until the stack is empty, at which point the growth point set S contains the segmented scratch pixels;
s403: map the segmented scratch pixels back to the prediction-box position in the original image to obtain the scratch segmentation result.
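A minimal sketch of steps s401 and s402: local-mean thresholding followed by stack-based 8-neighbourhood growing. The polarity (scratch pixels brighter than their surroundings) and the naive window handling are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def adaptive_mean_threshold(img, win=3):
    # s401 sketch: threshold each pixel against the mean of its local window;
    # assumes scratches are brighter than their surroundings (flip the
    # comparison if they are darker in a given dataset)
    h, w = img.shape
    r = win // 2
    mask = np.zeros((h, w), dtype=bool)
    for i in range(h):
        for j in range(w):
            window = img[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            mask[i, j] = img[i, j] > window.mean()
    return mask

def region_grow(mask, seeds):
    # s402 sketch: stack-based 8-neighbourhood growing restricted to the
    # thresholded set A (the boolean mask); returns the growth point set S
    h, w = mask.shape
    grown, stack = set(), [s for s in seeds if mask[s]]
    while stack:
        r0, c0 = stack.pop()
        if (r0, c0) in grown:
            continue
        grown.add((r0, c0))
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r0 + dr, c0 + dc
                if 0 <= nr < h and 0 <= nc < w and mask[nr, nc] and (nr, nc) not in grown:
                    stack.append((nr, nc))
    return grown

mask = np.zeros((5, 5), dtype=bool)
mask[2, 1:4] = True                           # a 3-pixel horizontal "scratch"
print(sorted(region_grow(mask, [(2, 2)])))    # [(2, 1), (2, 2), (2, 3)]
```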
Compared with the prior art, the invention has the advantages that:
(1) The invention combines a deep learning network with an image segmentation algorithm to analyze and segment scratches from qualitative to quantitative: the strong target recognition ability of deep learning suppresses background-noise interference, while the image segmentation algorithm based on hand-crafted features retains its accuracy in segmenting scratches. The invention is compared with existing traditional segmentation techniques and with current mainstream semantic segmentation networks, namely the Otsu threshold method, the Sobel edge detection method, the Gabor filter method, and the semantic segmentation networks U-Net, PSPNet and DeepLabv3; the experimental results on the two datasets are shown in Tables 1 and 2 respectively. Precision and Recall measure the proportion of correctly predicted scratch pixels in the predicted and labeled images respectively, and must be analyzed together to reflect the overall segmentation effect; mIoU and mPA jointly describe the predicted and labeled images and comprehensively measure segmentation accuracy.
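For reference, the per-image versions of these indices on binary masks can be computed as follows (the mIoU and mPA figures in the tables are class-averaged; this single-class sketch and its names are illustrative, not the evaluation code of the patent):

```python
import numpy as np

def seg_metrics(pred, gt):
    # pred, gt: boolean masks; returns Precision, Recall, IoU and pixel accuracy
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)      # correct scratch pixels / predicted
    recall = tp / max(gt.sum(), 1)           # correct scratch pixels / labeled
    iou = tp / max(np.logical_or(pred, gt).sum(), 1)
    pixel_acc = (pred == gt).mean()
    return float(precision), float(recall), float(iou), float(pixel_acc)
```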
TABLE 1 comparison of scratch segmentation results for electronic commutator scratch data sets
(Table rendered as an image in the original; values not reproduced.)
TABLE 2 scratch segmentation result comparison of weld X-ray image datasets
(Table rendered as an image in the original; values not reproduced.)
Among the segmentation methods based on digital image processing, the Otsu threshold method is easily disturbed by uninteresting regions or non-scratch defects, so its detection accuracy is low. The Sobel edge detection method adapts well to changing brightness and noise and removes noise interference better than the Otsu method, but it still cannot filter out some background noise and localizes scratch edges inaccurately, so its accuracy remains below that of the present method. The Gabor filter method detects well on the electronic commutator scratch dataset, where the contrast between background and scratches is high, but in the weld X-ray images the image edges show obvious illumination-change interference, from which scratches cannot be effectively extracted.
Among the deep-learning semantic segmentation methods, U-Net uses a large number of channels during up-sampling and effectively exploits the context information of high-resolution feature layers, so it detects fine scratches well. PSPNet and DeepLabv3 have larger receptive fields and are better suited to tasks that emphasize global information, such as scene parsing, so they extract local scratch information inaccurately.
In the present invention, only the prediction-box image is segmented and then mapped back to its position in the original image, so scratch details are extracted accurately while background interference is effectively avoided, demonstrating the advantage of combining target detection with image segmentation.
(2) To identify scratches effectively, a scratch recognition network is designed: a multi-feature fusion module based on scratch morphological characteristics is added to the SSD network, achieving higher precision than current mainstream target recognition methods while maintaining a high detection speed. The scratch recognition network of the invention is compared with other mainstream target recognition methods, namely the two-stage Faster R-CNN and R-FCN and the one-stage YOLOv4, SSD and RetinaNet; the experimental results on the two datasets are shown in Tables 3 and 4 respectively.
TABLE 3 comparison of scratch recognition results for electronic commutator scratch data sets
(Table rendered as an image in the original; values not reproduced.)
TABLE 4 Comparison of scratch recognition results for the weld X-ray image dataset
(Table rendered as an image in the original; values not reproduced.)
Among the two-stage methods, Faster R-CNN approaches the detection precision of the invention, but its detection time is 2 to 5.5 times longer. R-FCN improves on Faster R-CNN; on the scratch datasets in this experiment its detection speed rises but its precision falls relative to Faster R-CNN, and both its speed and precision are below those of the invention. The one-stage methods detect faster than the two-stage methods: YOLOv4 and RetinaNet differ little from the invention in detection speed but fall below it in precision. SSD is the prototype that the invention improves upon; the added network complexity slightly lowers the detection speed but clearly raises the precision. Overall, compared with mainstream target detection networks, the method attains the highest detection precision while maintaining a high detection speed.
(3) A principal-component growing segmentation algorithm is designed for the extracted scratch prediction box, guaranteeing the completeness and accuracy of scratch extraction while effectively suppressing background noise. The algorithm is compared with existing traditional segmentation techniques and with current mainstream semantic segmentation networks, namely the Otsu threshold method, the Sobel edge detection method, the Gabor filter method, and the semantic segmentation networks U-Net, PSPNet and DeepLabv3; the experimental results on the two datasets are shown in Tables 5 and 6 respectively.
TABLE 5 comparison of scratch prediction box segmentation results for electronic commutator scratch data sets
(Table rendered as an image in the original; values not reproduced.)
TABLE 6 scratch segmentation comparison of weld X-ray image datasets
(Table rendered as an image in the original; values not reproduced.)
Because the scratch images in the weld X-ray image dataset are formed by X-rays, the contrast of the scratches against the background is low and the illumination varies widely, so segmentation is harder than on the electronic commutator scratch dataset. Segmentation algorithms based on digital image processing are sensitive to illumination changes and can hardly suppress background noise; their detection results are especially poor on the weld X-ray images with markedly uneven brightness, and their quantitative indices mIoU and mPA are lower than those of the present method. Deep-learning semantic segmentation methods, which label and train scratches pixel by pixel, suppress background better than digital-image-processing algorithms, but their segmentation accuracy and completeness on some low-contrast and fine scratches are insufficient, and their mIoU and mPA are also lower than those of the present method. The principal-component growing segmentation algorithm grows regions from the principal component points of the scratch prediction-box image, effectively eliminating background-noise interference while guaranteeing the completeness of scratch extraction.
Drawings
FIG. 1 is a flow chart of a scratch detection method based on a deep convolutional neural network and image segmentation according to the present invention;
FIG. 2 is a diagram of a scratch recognition positioning network according to the present invention;
FIG. 3 is a block diagram of a multi-feature fusion module according to the present invention;
FIG. 4 is a flow chart of the principal component growing segmentation algorithm of the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings.
As shown in fig. 1, the scratch detection method based on the deep convolutional neural network and the image segmentation of the present invention specifically includes the following steps:
s1: the structure of the scratch identification and positioning network is shown in FIG. 2. According to the morphological characteristics of scratches, a multi-feature fusion module composed of an up-sampling layer, a lateral connection layer and a feature fusion layer is designed; the module up-samples the high-level features to the same scale as the low-level features and then fuses them through the lateral connection layer. The 6 feature layers used for prediction in the SSD network are respectively fed into 6 multi-feature fusion modules, and the feature layers output by the modules are used for prediction in step S2. FIG. 3 shows the structure of the multi-feature fusion module;
Specifically, S101: the up-sampling layer in the multi-feature fusion module applies a deconvolution operation to the last prediction feature layer of the SSD network and to the 2nd to 5th feature layers in the fusion layer, expanding each to twice its original size;
S102: the lateral connection layer in the multi-feature fusion module applies a convolution to the first 5 prediction feature layers of the SSD network, reducing their channel counts to 1/2 so that they match the channel counts of the up-sampled feature maps and satisfy the fusion condition;
S103: the fusion layer in the multi-feature fusion module performs tensor concatenation on the up-sampled and laterally connected feature maps of equal size, producing 5 prediction feature layers containing both deep and shallow feature information; the last prediction feature layer of the SSD network is retained as the 6th prediction layer. Adding the multi-feature fusion module, designed from scratch morphological characteristics, to the SSD network improves its ability to detect small, low-contrast scratches;
s2: and constructing a loss function of the category and the position, performing regression analysis on the feature layer output by the multi-feature fusion module by using the loss function of the category and the position, identifying the scratch based on the category loss function, calculating coordinates of a scratch prediction frame, and outputting an image of the scratch prediction frame.
Specifically, S201: calculating a category loss function and a position loss function, extracting and screening prior frames according to the loss function, wherein the calculation formulas of the category loss function and the position loss function are
Figure BDA0003476210810000091
Figure BDA0003476210810000092
Wherein, in the class loss function,
Figure BDA0003476210810000093
showing whether the ith prior frame is matched with the real object frame of the jth target with the pth characteristic or not, if so
Figure BDA0003476210810000094
Taking 1 indicates a match, taking 0 indicates a mismatch,
Figure BDA0003476210810000095
representing the ith prior box and the p-th feature,
Figure BDA0003476210810000096
indicates that the ith prior box does not match any bitsAnd (4) sign matching, wherein N is the matching number of the scratch real object frame and the prior frame, pos is a positive sample prior frame, and neg is a negative sample prior frame. In the function of the position loss,
Figure BDA0003476210810000097
is the deviation between the ith a priori box coordinates and the vth a priori box coordinates,
Figure BDA0003476210810000098
representing the deviation, Smooth, between the jth scratch real object frame and the v-th prior frameL1For the L1 regularization function, cx,cyAnd w, h are the center coordinates, width and height of the prior frame, respectively.
S202: according to the center coordinate, the width and the height of the screened prior frame, a scratch prediction frame and the coordinate position thereof are obtained by decoding, and the width and the height of the input image and the scratch marking object frame are respectively set as wi,hi,wo,hoThen the coordinates of the scratch prediction box are
Figure BDA0003476210810000099
S203: and clipping the input image to obtain a scratch prediction frame image.
S3: the flow chart of the principal component growth segmentation algorithm is shown in FIG. 4, the scratch direction in the scratch prediction frame image is calculated by using a principal component analysis method, and an average value point and a starting point are calculated according to the scratch direction and are used as region growth seed points;
Specifically, S301: binarize the scratch prediction-box image and denote the binarized scratch pixels by X = (x^{(1)}, x^{(2)}, ..., x^{(n)}), where each pixel is represented by its 2-dimensional coordinates:

x^{(i)} = (x_{1}^{(i)}, x_{2}^{(i)})^{T}, i = 1, 2, ..., n    (4)

To predict the scratch direction, the 2-dimensional data are mapped into a 1-dimensional space. Let the projection transformation matrix be W; the projection is then:

z^{(i)} = W^{T} x^{(i)}, i = 1, 2, ..., n    (5)

To preserve as much of the original data information as possible, the sum of the variances of the projected data, \sum_{i} W^{T} x^{(i)} x^{(i)T} W, is maximized, i.e. the trace of the projected-data covariance is maximized:

\max_{W} \; tr(W^{T} X X^{T} W), \quad s.t.\; W^{T} W = I    (6)

where I is the identity matrix. Solving with the Lagrange multiplier method, the Lagrangian function is set as:

J(W) = tr(W^{T} X X^{T} W + \lambda (W^{T} W - I))    (7)

where \lambda is the Lagrange multiplier. Differentiating with respect to W and rearranging gives:

X X^{T} W = (-\lambda) W    (8)

Equation (8) shows that W is composed of eigenvectors of the covariance matrix X X^{T}, and -\lambda is the corresponding eigenvalue of X X^{T}.
S302: compute the sample covariance matrix X X^{T} and take the eigenvector corresponding to its largest eigenvalue, recording it as the scratch direction;
S303: compute the sample center \bar{x} = (1/n) \sum_{i=1}^{n} x^{(i)} as the mean point, and take the leftmost binarized scratch pixel along the scratch direction as the starting point; the mean point and the starting point serve as the region-growing seed points.
S4: and carrying out 8-neighborhood region growth on pixel points in the scratch prediction frame image after the self-adaptive threshold segmentation, and restoring the growth point set to the position of the original image prediction frame to obtain a scratch segmentation result.
Specifically, S401: Perform adaptive threshold segmentation on the scratch prediction box image, using the mean gray value of a neighborhood window centered on each pixel as the segmentation threshold, which reduces the influence of uneven illumination and noise on the segmentation, and obtain the target scratch pixel set A;

S402: Grow the 8-neighborhood regions of the points in the seed set, checking whether each 8-neighborhood pixel P of a seed point satisfies the growth criterion R_g(P):

R_g(P) = 1 if P ∈ A and P ∉ S, and R_g(P) = 0 otherwise

If R_g(P) = 1, push P onto the seed-point stack and record it in the growth point set S; evaluate the growth criterion for every pixel in the stack until the stack is empty. The growth point set S then contains the segmented scratch pixels.

S403: Restore the segmented scratch pixels to the position of the original-image prediction box to obtain the scratch segmentation result.
Application of the invention: experiments were carried out on an electronic-commutator scratch data set and a weld-seam X-ray image data set, using the PyTorch deep learning framework on an Ubuntu 16.04 Linux operating system with a 3.40 GHz Intel Core i7-6800K CPU and a GeForce GTX 1080Ti GPU with 8 GB of memory. To better match practical applications, data augmentation operations such as flipping, rotation, scaling, cropping, translation, and noise addition were applied, improving the generalization ability and robustness of the model.
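The purely geometric augmentations listed (flipping and rotation) can be illustrated on a raw 2-D grid; this is a generic sketch, not the augmentation pipeline actually used in the experiments, and scaling, cropping, translation, and noise addition are omitted for brevity:

```python
def augment(img):
    """Simple geometric augmentations on an image given as a list of
    rows: horizontal flip, vertical flip, and 90-degree clockwise
    rotation. Illustrative only."""
    hflip = [row[::-1] for row in img]            # mirror each row
    vflip = img[::-1]                             # reverse row order
    rot90 = [list(r) for r in zip(*img[::-1])]    # 90 deg clockwise
    return {"hflip": hflip, "vflip": vflip, "rot90": rot90}
```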

Claims (5)

1. A scratch detection method based on a deep convolutional neural network and image segmentation, characterized in that the method comprises the following steps:
S1: according to the morphological characteristics of scratches, a multi-feature fusion module consisting of an upsampling layer, a lateral connection layer and a feature fusion layer is designed; the module upsamples high-level features to the same scale as the low-level features and then fuses them through the lateral connection layer, and the 6 feature layers used for prediction in the single-shot multibox detector (SSD) network are each input into the multi-feature fusion module;
S2: category and position loss functions are constructed and used to perform regression analysis on the feature layers output by the multi-feature fusion module; scratches are identified on this basis, the coordinates of the scratch prediction box are calculated, and the scratch prediction box image is output;
S3: the scratch direction in the scratch prediction box image is calculated by principal component analysis, and a mean point and a starting point derived from the scratch direction are taken as region-growing seed points;
S4: adaptive threshold segmentation is performed on the scratch prediction box image, 8-neighborhood region growing is performed on the pixels of the segmented image, and the grown point set is restored to the position of the original-image prediction box to obtain the scratch segmentation result.
2. The scratch detection method based on the deep convolutional neural network and image segmentation as claimed in claim 1, wherein step S1 comprises the following steps:
S101: the upsampling layer in the multi-feature fusion module performs a deconvolution operation on the last feature layer used for prediction in the SSD network and on the 2nd, 3rd, 4th and 5th fused feature layers, expanding their sizes to twice the sizes of the original feature layers;
S102: the lateral connection layer in the multi-feature fusion module performs a convolution operation on the first 5 feature layers used for prediction in the SSD network, reducing their channel number to 1/2 so that it matches the channel number of the upsampled feature maps and satisfies the fusion condition;
S103: the fusion layer in the multi-feature fusion module performs tensor calculation on the equally sized upsampled and laterally connected feature maps, yielding 5 prediction feature layers that contain both deep and shallow feature information; the last feature layer used for prediction in the SSD network is retained as the 6th prediction feature layer, and the 6 feature layers output by the multi-feature fusion module are used for prediction in step S2.
3. The scratch detection method based on the deep convolutional neural network and image segmentation as claimed in claim 1, wherein step S2 comprises the following steps:
S201: calculate the category loss function and the position loss function, and extract and screen the prior boxes according to the loss functions; the category and position loss functions are:

L_conf(x, c) = −Σ_{i∈Pos}^{N} x_ij^p log(ĉ_i^p) − Σ_{i∈Neg} log(ĉ_i^0)  (1)

L_loc(x, l, g) = Σ_{i∈Pos}^{N} Σ_{v∈{cx,cy,w,h}} x_ij^p SmoothL1(l_i^v − ĝ_j^v)  (2)

where, in the category loss function, x_ij^p indicates whether the i-th prior box matches the real object box of the j-th target of class p: x_ij^p = 1 denotes a match and x_ij^p = 0 a mismatch; ĉ_i^p is the confidence that the i-th prior box belongs to class p, and ĉ_i^0 the confidence that the i-th prior box matches no class; N is the number of matches between scratch real object boxes and prior boxes, Pos denotes the positive-sample prior boxes and Neg the negative-sample prior boxes. In the position loss function, l_i^v is the offset of the i-th prior box for coordinate v, ĝ_j^v is the offset of the j-th scratch real object box for coordinate v, SmoothL1 is the smooth L1 loss function, and cx, cy, w, h are respectively the center coordinates, width and height of the prior box;
S202: according to the center coordinates, width and height of the screened prior boxes, decode to obtain the scratch prediction box and its coordinate position; let the width and height of the input image and of the scratch annotation box be w_i, h_i, w_o, h_o respectively; the coordinates of the scratch prediction box are then:

[equation (4): prediction-box decoding formula; equation image not reproduced]

S203: crop the input image to obtain the scratch prediction box image.
4. The scratch detection method based on the deep convolutional neural network and image segmentation as claimed in claim 1, wherein step S3 comprises the following steps:
S301: binarize the scratch prediction box image and denote the binarized scratch pixels X = (x^(1), x^(2), ..., x^(n)), where each pixel point is represented by its 2-dimensional coordinates as:

x^(i) = (x_1^(i), x_2^(i))^T, i = 1, 2, ..., n  (3)

To predict the scratch direction, the 2-dimensional data is mapped to a 1-dimensional space; let W be the projection transformation matrix of the mapping, so that the projection is:

z^(i) = W^T x^(i), i = 1, 2, ..., n  (5)

To preserve as much of the original data information as possible, the sum of the variances of the projected data, Σ_{i=1}^{n} W^T x^(i) x^(i)^T W, is maximized, i.e. the trace of the projected-data covariance is maximized:

max_W tr(W^T X X^T W), s.t. W^T W = I  (6)

where I is the identity matrix; the problem is solved with the Lagrange multiplier method, with the Lagrangian function set as:

J(W) = tr(W^T X X^T W + λ(W^T W − I))  (7)

where λ is the Lagrange multiplier; taking the derivative with respect to W gives:

X X^T W = (−λ)W  (8)

so W is the matrix of eigenvectors of the covariance matrix X X^T, and −λ the corresponding matrix of eigenvalues of X X^T;
S302: compute the sample covariance matrix X X^T, take the eigenvector corresponding to its largest eigenvalue, and record it as the scratch direction;
S303: compute the sample center x̄ = (1/n) Σ_{i=1}^{n} x^(i) as the mean point and take the leftmost binarized scratch pixel along the scratch direction as the starting point; the mean point and the starting point serve as the region-growing seed points.
5. The scratch detection method based on the deep convolutional neural network and image segmentation as claimed in claim 1, wherein step S4 comprises the following steps:
S401: perform adaptive threshold segmentation on the scratch prediction box image, using the mean gray value of a neighborhood window centered on each pixel as the segmentation threshold, which reduces the influence of uneven illumination and noise on the segmentation, and obtain the target scratch pixel set A;
S402: grow the 8-neighborhood regions of the points in the seed set, checking whether each 8-neighborhood pixel P of a seed point satisfies the growth criterion R_g(P):

R_g(P) = 1 if P ∈ A and P ∉ S, and R_g(P) = 0 otherwise

if R_g(P) = 1, push P onto the seed-point stack and record it in the growth point set S; evaluate the growth criterion for every pixel in the stack until the stack is empty, whereupon the growth point set S contains the segmented scratch pixels;
S403: restore the segmented scratch pixels to the position of the original-image prediction box to obtain the scratch segmentation result.
CN202210055893.4A 2022-01-18 2022-01-18 Scratch detection method based on deep convolutional neural network and image segmentation Pending CN114417993A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210055893.4A CN114417993A (en) 2022-01-18 2022-01-18 Scratch detection method based on deep convolutional neural network and image segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210055893.4A CN114417993A (en) 2022-01-18 2022-01-18 Scratch detection method based on deep convolutional neural network and image segmentation

Publications (1)

Publication Number Publication Date
CN114417993A true CN114417993A (en) 2022-04-29

Family

ID=81272918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210055893.4A Pending CN114417993A (en) 2022-01-18 2022-01-18 Scratch detection method based on deep convolutional neural network and image segmentation

Country Status (1)

Country Link
CN (1) CN114417993A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024065701A1 * 2022-09-30 2024-04-04 京东方科技集团股份有限公司 Image inpainting method and apparatus, device, and non-transitory computer storage medium
CN115601361A * 2022-12-13 2023-01-13 苏州迈创信息技术有限公司 Machine vision based machine tool part online detection method
CN117351420A * 2023-10-18 2024-01-05 江苏思行达信息技术有限公司 Intelligent door opening and closing detection method
CN117495938A * 2024-01-02 2024-02-02 山东力乐新材料研究院有限公司 Foldable hollow plate production data extraction method based on image processing
CN117495938B * 2024-01-02 2024-04-16 山东力乐新材料研究院有限公司 Foldable hollow plate production data extraction method based on image processing

Similar Documents

Publication Publication Date Title
CN114417993A (en) Scratch detection method based on deep convolutional neural network and image segmentation
Pandey et al. Hybrid deep neural network with adaptive galactic swarm optimization for text extraction from scene images
CN111444921A (en) Scratch defect detection method and device, computing equipment and storage medium
CN109002824B (en) OpenCV-based building drawing label information detection method
Khare et al. A blind deconvolution model for scene text detection and recognition in video
Bhunia et al. Text recognition in scene image and video frame using color channel selection
Ji et al. Automated pixel-level surface crack detection using U-Net
CN115082466B (en) PCB surface welding spot defect detection method and system
Sridevi et al. A survey on monochrome image segmentation methods
CN110751619A (en) Insulator defect detection method
CN113298809B (en) Composite material ultrasonic image defect detection method based on deep learning and superpixel segmentation
Lim et al. Text segmentation in color images using tensor voting
Wang et al. Attention-based deep learning for chip-surface-defect detection
CN112749673A (en) Method and device for intelligently extracting stock of oil storage tank based on remote sensing image
Xue et al. Curved text detection in blurred/non-blurred video/scene images
Paul et al. Text localization in camera captured images using fuzzy distance transform based adaptive stroke filter
Yang et al. A scratch detection method based on deep learning and image segmentation
CN113609984A (en) Pointer instrument reading identification method and device and electronic equipment
CN109001213B (en) Reel-to-reel ultrathin flexible IC substrate appearance detection method
Kumar An efficient text extraction algorithm in complex images
Marcuzzo et al. Automated Arabidopsis plant root cell segmentation based on SVM classification and region merging
Makwana et al. PCBSegClassNet—A light-weight network for segmentation and classification of PCB component
CN114581928A (en) Form identification method and system
Mei et al. A conditional wasserstein generative adversarial network for pixel-level crack detection using video extracted images
CN116868226A (en) Detection of annotated regions of interest in images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination