CN114627106A

CN114627106A - Weld defect detection method based on Cascade Mask R-CNN model

Info

Publication number: CN114627106A
Application number: CN202210349016.8A
Authority: CN
Inventors: 梁丽红; 陈赡舒; 郭文明; 代淮北
Original assignee: China Special Equipment Inspection and Research Institute
Current assignee: China Special Equipment Inspection and Research Institute
Priority date: 2022-04-01
Filing date: 2022-04-01
Publication date: 2022-06-14

Abstract

A welding seam defect detection method based on the Cascade Mask R-CNN model, firstly obtains digital images of weld seam ray inspection including five kinds of defects, including round defects, strip defects, cracks, incomplete penetration and incomplete fusion, and their corresponding annotation files, And divided into training set, validation set and test set; then use the training set after image preprocessing to train the weld defect detection model built and optimized based on the Cascade Mask R‑CNN model, and use the validation set to obtain the optimal weight file The test set is then tested to evaluate the performance of the weld defect detection model. The invention improves the problem of low accuracy of the target detection model based on a single threshold value and the overfitting problem caused by directly increasing the threshold value, and improves the detection accuracy of welding seam defects.

Description

A Weld Defect Detection Method Based on Cascade Mask R-CNN Model

技术领域technical field

本发明涉及目标检测领域，尤其涉及一种基于Cascade Mask R-CNN模型的焊缝缺陷检测方法。The invention relates to the field of target detection, in particular to a welding seam defect detection method based on a Cascade Mask R-CNN model.

背景技术Background technique

焊接是工业化生产与制造过程中非常重要的技术之一，为了保证焊接质量则需要对焊缝进行无损检测，其中射线检测是常用的无损检测技术之一。目前射线检测的缺陷定性和定位主要依赖人工评定，即人工评片。人工评片受评片人员的专业水平和自身状况等主观因素影响较大且效率低，无法满足现代工业的自动化、智慧化检测需求。Welding is one of the most important technologies in the industrial production and manufacturing process. In order to ensure the welding quality, it is necessary to carry out non-destructive testing of welds. Among them, radiographic testing is one of the commonly used non-destructive testing techniques. At present, the characterization and positioning of defects in radiographic inspection mainly rely on manual evaluation, that is, manual evaluation of films. Manual film evaluation is greatly affected by subjective factors such as the professional level of the film reviewers and their own conditions, and the efficiency is low, which cannot meet the needs of automated and intelligent detection in modern industries.

近年来深度学习的发展不仅突破了很多难以解决的视觉难题，提升了对于图像认知的水平，更是加速了目标检测领域相关技术的进步，将深度学习中的图像特征自动学习方法应用于工业产品焊缝缺陷检测中成为了主流的研究方向。基于深度学习的目标检测模型主要分为one-stage和two-stage两类，one-stage目标检测模型使用卷积神经网络直接预测目标的类别和位置，实现分类和回归一步到位，该类模型的特点是速度快，但是精度较低；two-stage目标检测模型则是先生成可能包含目标的建议候选框，再通过预测网络进行分类和回归，该类模型的特点是速度慢，精度较高。除此之外，常见的目标检测模型在计算建议候选框与真实标注框的交并比(IoU)时，通过对比交并比(IoU)与设定的单一阈值将建议候选框分为正负样本，通常正样本会远大于负样本，在测试阶段会对正负样本进行采样使两者比例满足某个比值，但在测试阶段由于无真实标注框的对比，导致建议候选框的质量偏低，检测精度较低；因此需要一种采用级联结构的模型来突破单一阈值下焊缝缺陷检测精度的瓶颈，使得模型在每一次回归后样本都更能靠近焊缝缺陷的真实位置，适应不同建议候选框的分布。In recent years, the development of deep learning has not only broken through many difficult visual problems, improved the level of image cognition, but also accelerated the progress of related technologies in the field of target detection. The automatic learning method of image features in deep learning has been applied to industrial Product weld defect detection has become the mainstream research direction. The target detection model based on deep learning is mainly divided into two categories: one-stage and two-stage. The one-stage target detection model uses a convolutional neural network to directly predict the category and position of the target, and realizes classification and regression in one step. It is characterized by fast speed but low accuracy; the two-stage target detection model first generates suggested candidate boxes that may contain targets, and then performs classification and regression through the prediction network. This type of model is characterized by slow speed and high accuracy. In addition, when calculating the intersection-over-union ratio (IoU) between the proposed candidate frame and the ground-truth frame, the common target detection model divides the proposed candidate frame into positive and negative by comparing the intersection-over-union ratio (IoU) and a single set threshold. Samples, usually positive samples are much larger than negative samples. In the testing phase, the positive and negative samples will be sampled to make the ratio between the two meet a certain ratio. However, in the testing phase, due to the lack of comparison of the real labeled boxes, the quality of the proposed candidate frame is low. , the detection accuracy is low; therefore, a model using a cascade structure is needed to break through the bottleneck of weld defect detection accuracy under a single threshold, so that the model can be closer to the real position of the weld defect after each regression, adapting to different The distribution of proposed candidate boxes.

发明内容SUMMARY OF THE INVENTION

本发明的目的在于提供一种基于Cascade Mask R-CNN模型的焊缝缺陷检测方法，从而解决现有技术中存在的前述问题。The purpose of the present invention is to provide a welding seam defect detection method based on the Cascade Mask R-CNN model, so as to solve the aforementioned problems existing in the prior art.

为了实现上述目的，本发明采用的技术方案如下：In order to achieve the above object, the technical scheme adopted in the present invention is as follows:

一种基于Cascade Mask R-CNN模型的焊缝缺陷检测方法，包括以下步骤：A welding seam defect detection method based on Cascade Mask R-CNN model, comprising the following steps:

S1、获取含有圆缺、条缺、裂纹、未焊透和未熔合五种缺陷的焊缝射线检测数字图像及其对应的标注文件；将所述焊缝射线检测数字图像和对应的所述标注文件划分为训练集、验证集和测试集；S1. Acquire a digital image of the weld seam radiographic inspection containing five kinds of defects, such as round defect, strip defect, crack, lack of penetration, and lack of fusion, and its corresponding annotation file; The file is divided into training set, validation set and test set;

S2、对焊缝射线检测数字图像进行图像预处理，得到增强且统一的图像；S2. Perform image preprocessing on the digital image of the weld seam radiographic inspection to obtain an enhanced and unified image;

S3、基于Cascade Mask R-CNN模型搭建并优化焊缝缺陷检测模型，所述焊缝缺陷检测模型包括特征提取的卷积神经网络和分类与回归的预测网络；S3. Build and optimize a weld defect detection model based on the Cascade Mask R-CNN model. The weld defect detection model includes a convolutional neural network for feature extraction and a prediction network for classification and regression;

S4、利用所述训练集对所述焊缝缺陷检测模型进行训练，并采用所述验证集对每轮训练得到的权重文件进行验证，获取在验证集上表现最优的权重文件；包括如下步骤：S4. Use the training set to train the weld defect detection model, and use the verification set to verify the weight file obtained in each round of training, and obtain the weight file with the best performance on the verification set; including the following steps :

S41、通过所述卷积神经网络对所述焊缝射线检测数字图像进行特征提取形成特征图；S41. Perform feature extraction on the digital image of the weld ray detection through the convolutional neural network to form a feature map;

S42、利用多尺度检测算法FPN改进RPN为所述特征图生成建议候选框；S42, using the multi-scale detection algorithm FPN to improve the RPN to generate a suggested candidate frame for the feature map;

S43、将所述建议候选框映射到所述特征图中得到所对应的特征矩阵，将所述特征矩阵通过ROI Align统一缩放到指定大小；S43, the proposed candidate frame is mapped to the feature map to obtain a corresponding feature matrix, and the feature matrix is uniformly scaled to a specified size by ROI Align;

S44、将所述建议候选框通过由目标分类器和边界框回归器组成的三级级联检测器得到三个设定阈值下对应的缺陷类别和边界框回归参数；最后通过非极大值抑制以及滤除低概率目标得到最终的检测结果；S44, passing the proposed candidate frame through a three-stage cascade detector composed of a target classifier and a bounding box regressor to obtain corresponding defect categories and bounding box regression parameters under three set thresholds; And filter out low-probability targets to get the final detection result;

S45、设置所述焊缝缺陷检测模型的训练参数，采用所述训练集通过所述步骤S41-S44对所述焊缝缺陷检测模型进行训练，并采用所述验证集对每一轮训练得到的权重文件进行验证，获取在所述验证集上表现最优的权重文件；S45. Set the training parameters of the weld defect detection model, use the training set to train the weld defect detection model through the steps S41-S44, and use the validation set to perform training on the weld defect detection model obtained in each round of training. The weight file is verified, and the weight file that performs best on the verification set is obtained;

S5、利用所述测试集对应用步骤S45得出的在验证集上表现最优的所述权重文件进行测试，以此评估该焊缝缺陷检测模型的性能。S5. Use the test set to test the weight file obtained in step S45 with the best performance on the verification set, so as to evaluate the performance of the weld defect detection model.

优选的，步骤S2中进行图像预处理的方法包括图像增强与图像去噪，所述图像增强采用AHE算法对所述焊缝射线检测数字图像进行图像增强，对所述焊缝射线检测数字图像进行细节锐化，凸显缺陷特征，所述图像增强的计算公式如下：Preferably, the method for image preprocessing in step S2 includes image enhancement and image denoising, and the image enhancement uses AHE algorithm to perform image enhancement on the digital image of the weld seam ray detection, and the digital image of the weld seam ray detection is image enhanced. The details are sharpened and the defect features are highlighted. The calculation formula of the image enhancement is as follows:

上式中：y_i,j表示变换前的中心像素，Y_i,j表示变换后的中心像素，m_i,j表示以y_i,j为中心点的局部区域的灰度均值，T表示对该点的累积分布变换函数，k表示自适应函数，由局部区域的像素征计算得出；In the above formula: y _i,j represents the center pixel before transformation, Y _i,j represents the center pixel after transformation, m _i,j represents the gray mean value of the local area with y _i,j as the center point, T represents the pair The cumulative distribution transformation function of the point, k represents the adaptive function, which is calculated from the pixel characteristics of the local area;

所述图像去噪采用DMB算法对焊缝射线检测数字图像进行图像去噪，降低图像噪声，对缺陷特征保留并增强。In the image denoising, the DMB algorithm is used to denoise the digital image of the weld seam ray inspection, so as to reduce the image noise and preserve and enhance the defect features.

优选的，步骤S41中的所述卷积神经网络为ResNeXt-101,所述卷积神经网络包括卷积层、池化层和激活层，卷积层从输入的所述图像中提取特征，生成特征图；池化层用于去除冗余信息，减少参数量，扩大接受域；激活层增加输出的非线性，利用激活函数对输出层的结果进行卷积，得到非线性映射。Preferably, the convolutional neural network in step S41 is ResNeXt-101, the convolutional neural network includes a convolutional layer, a pooling layer and an activation layer, and the convolutional layer extracts features from the input image to generate Feature map; the pooling layer is used to remove redundant information, reduce the amount of parameters, and expand the receptive field; the activation layer increases the nonlinearity of the output, and uses the activation function to convolve the results of the output layer to obtain a nonlinear mapping.

优选的，步骤S42中利用FPN改进RPN为特征图生成建议候选框，包括以下步骤：Preferably, in step S42, using FPN to improve RPN to generate suggested candidate frames for the feature map, including the following steps:

S421：在所述卷积神经网络的前向过程中前馈ResNeXt-101的一部分，记ResNeXt-101每级最后一个残差块的输出为{C1,C2,C3,C4,C5}，首先是自底向上的过程，每一级往上采用设定步长的降采样，将不改变特征图大小的层归为一个stage以此构成特征金字塔；S421: Feed forward a part of ResNeXt-101 in the forward process of the convolutional neural network, and denote the output of the last residual block of each stage of ResNeXt-101 as {C1, C2, C3, C4, C5}, the first is In the bottom-up process, downsampling with a set step size is used for each level upwards, and the layers that do not change the size of the feature map are classified as a stage to form a feature pyramid;

S422：自顶向下采用上采样的方式将顶层的小特征图放大到与上一个stage的特征图一样的大小；S422: Upsampling is used from top to bottom to enlarge the small feature map of the top layer to the same size as the feature map of the previous stage;

S423：所述横向连接将上采样结果和自底向上生成的相同大小的特征图进行融合，通过3×3的卷积核对每个融合结果进行卷积得到最终的特征层为P＝{P2,P3,P4,P5}。S423: The horizontal connection fuses the upsampling result and the feature map of the same size generated from the bottom to the top, and convolves each fusion result through a 3×3 convolution kernel to obtain the final feature layer P={P2, P3,P4,P5}.

优选的，所述步骤S421还包括：在ResNeXt-101的{C3,C4,C5}中引入高效注意力模块，首先采用1×1的卷积W_k以及softmax获得注意力权重，通过注意力池化获取全局上下文特征，其次通过一个1×1的卷积W_v1和层归一化后，由ReLU函数激活，经过一个1×1的卷积W_v2，得到每个通道的重要程度，最后利用加法将全局上下文特征聚合到每个位置的特征上，形成长距离依赖关系；所述注意力模块计算公式如下：Preferably, the step S421 further includes: introducing an efficient attention module into {C3, C4, C5} of ResNeXt-101, firstly using a 1×1 convolution W _k and softmax to obtain the attention weight, and using the attention pool The global context features are obtained through a 1×1 convolution W _v1 and layer normalization, activated by the ReLU function, and after a 1×1 convolution W _v2 , the importance of each channel is obtained, and finally the use of The addition aggregates the global context features to the features of each position to form long-distance dependencies; the attention module calculation formula is as follows:

上式中：z_i表示注意力模块的输入，Z_i表示注意力模块的输出，N_p表示为特征映射中的位置数，

表示全局注意力池化的权重，LN代表层归一化，W_v2ReLU(LN(W_v1(·)))表示计算每个通道的重要程度。In the above formula: z _i represents the input of the attention module, Z _i represents the output of the attention module, N _p represents the number of positions in the feature map,

represents the weight of global attention pooling, LN represents layer normalization, and W _v2 ReLU(LN(W _v1 ( ))) represents the importance of calculating each channel.

优选的，所述步骤S423之后还包括：通过K-means聚类算法统计标注文件中矩形标注框的面积和长宽比，设置五种面积{32²,64²,128²,256²,512²}的anchor分别对应到{P2,P3,P4,P5,P6}五个特征层上，其中P6特征层通过P5特征层降采样得到，每种面积的anchor设置七种长宽比{1:10,1:5,1:2,1:1,2:1,5:1,10:1}，生成的anchor在特征层上滑动遍历，进而生成建议候选框。Preferably, after the step S423, the method further includes: using the K-means clustering algorithm to count the area and the aspect ratio of the rectangular annotation frame in the annotation file, and set five types of areas {32 ² , 64 ² , 128 ² , 256 ² , 512 The anchors of ² } correspond to the five feature layers of {P2, P3, P4, P5, P6}, of which the P6 feature layer is obtained by downsampling the P5 feature layer, and the anchor of each area is set to seven aspect ratios {1: 10, 1: 5, 1: 2, 1: 1, 2: 1, 5: 1, 10: 1}, the generated anchor slides and traverses on the feature layer, and then generates the proposed candidate frame.

优选的，步骤S44中的所述三级级联检测器中的边界框回归器被定义为一个级联回归问题，级联回归通过重采样改变不同阶段所要处理的样本分布，所述边界框回归器定义如下：Preferably, the bounding box regressor in the three-stage cascade detector in step S44 is defined as a cascade regression problem, and the cascade regression changes the distribution of samples to be processed in different stages through resampling, and the bounding box regression The device is defined as follows:

上式中：x表示子图像块，b表示样本分布，f表示边界框回归器，f₁、f₂、f₃设定的阈值分别为0.4、0.5、0.6，f₁的输出作为f₂的输入，f₂的输出作为f₃的输入，{f₁,f₂,f₃}针对不同阶段的重采样分布进行优化，同时作用于训练和测试阶段；In the above formula: x represents the sub-image block, b represents the sample distribution, f represents the bounding box regressor, the thresholds set by f ₁ , _f ₂ , and _f ₃ are 0.4, 0.5, and 0.6, respectively. Input, the output of f ₂ is used as the input of f ₃ , {f ₁ , f ₂ , f ₃ } is optimized for the resampling distribution of different stages, and acts on the training and testing stages at the same time;

定义训练模型的总损失函数L包括两部分：边界框回归损失、目标分类损失，所述总损失函数计算公式如下：The total loss function L that defines the training model includes two parts: bounding box regression loss and target classification loss. The calculation formula of the total loss function is as follows:

L(x^t,g)＝L_cls(h_t(x^t),X^t)+μ[X^t≥1]L_loc(f_t(x^t,b^t),g)L(x ^t ,g)=L _cls (h _t (x ^t ),X ^t )+μ[X ^t ≥1]L _loc (f _t (x ^t ,b ^t ),g)

上式中：L_cls表示目标分类的损失函数，L_loc表示边界框回归的损失函数，{b^t}表示不同训练阶段t的样本分布且有b^t＝f_t-1(x^t-1,b^t-1)，h_t表示目标分类器，f_t表示边界框回归器，g表示对应x^t的真实目标框参数，μ表示折中系数，X^t表示x^t对应的标签，[·]表示一个指示函数。In the above formula: L _cls represents the loss function of target classification, L _loc represents the loss function of bounding box regression, {b ^t } represents the sample distribution of different training stages t and has b ^t =f _t-1 (x ^t-1 , b ^t-1 ), h _t represents the target classifier, f _t represents the bounding box regressor, g represents the real target box parameter corresponding to x ^t , μ represents the compromise coefficient, X ^t represents the label corresponding to x ^t , [ ] Represents an indicator function.

优选的，步骤S45中的所述训练参数设置包括学习率、动量、权重衰减、batch_size的值和训练总轮数，利用ImageNet数据集训练权重作为所述焊缝缺陷检测模型的预训练权重，所述焊缝射线检测数字图像进入所述焊缝缺陷检测模型前统一缩放尺寸，并使用随机梯度下降法更新所述焊缝缺陷检测模型的参数。Preferably, the training parameter settings in step S45 include learning rate, momentum, weight decay, the value of batch_size and the total number of training rounds, and the ImageNet dataset training weight is used as the pre-training weight of the weld defect detection model, so The size of the digital image of the welding seam ray detection is uniformly scaled before entering the welding seam defect detection model, and the parameters of the welding seam defect detection model are updated by using the stochastic gradient descent method.

本发明的有益效果是：本发明公开了一种基于Cascade Mask R-CNN模型的焊缝缺陷检测方法，通过引入三级级联结构，改善了基于单一阈值的目标检测模型精度较低问题以及直接提高阈值而产生的过拟合问题；通过多尺度检测算法FPN对RPN进行改进，融合了具有高分辨率的低层特征信息与具有高语义的高层特征信息；通过在Cascade Mask R-CNN模型中引入高效注意力模块，在简化Non-Local注意力机制计算量的基础上，维持着相近的精度结果，并通过特征图的各个位置聚合相同的特征来强化原有特征。本发明相较于现有的two-stage目标检测模型以及原始Cascade Mask R-CNN模型，对焊缝缺陷的检测精度均有明显提升。The beneficial effects of the present invention are as follows: the present invention discloses a welding seam defect detection method based on the Cascade Mask R-CNN model. By introducing a three-level cascade structure, the problem of low accuracy of the target detection model based on a single threshold and the direct The over-fitting problem caused by increasing the threshold; the RPN is improved by the multi-scale detection algorithm FPN, which combines low-level feature information with high resolution and high-level feature information with high semantics; by introducing Cascade Mask R-CNN model The efficient attention module maintains similar accuracy results on the basis of simplifying the computation of the Non-Local attention mechanism, and strengthens the original features by aggregating the same features in each position of the feature map. Compared with the existing two-stage target detection model and the original Cascade Mask R-CNN model, the invention significantly improves the detection accuracy of weld defects.

附图说明Description of drawings

图1是基于Cascade Mask R-CNN模型的焊缝缺陷检测方法流程图；Figure 1 is a flowchart of the weld defect detection method based on the Cascade Mask R-CNN model;

图2是收集的焊缝射线检测数字图像缺陷部分；Figure 2 is the digital image defect part of the collected weld radiographic inspection;

图3是利用FPN改进RPN的结构示意图；Fig. 3 is the structural representation that utilizes FPN to improve RPN;

图4是高效注意力模块示意图；Figure 4 is a schematic diagram of an efficient attention module;

图5是Cascade Mask R-CNN模型的结构示意图；Figure 5 is a schematic structural diagram of the Cascade Mask R-CNN model;

图6是训练过程中损失函数值与迭代次数的关联曲线。Figure 6 is the correlation curve between the loss function value and the number of iterations in the training process.

具体实施方式Detailed ways

为了使本发明的目的、技术方案及优点更加清楚明白，以下结合附图，对本发明进行进一步详细说明。应当理解，此处所描述的具体实施方式仅仅用以解释本发明，并不用于限定本发明。In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

一种基于Cascade Mask R-CNN模型的焊缝缺陷检测方法，首先将包含圆缺、条缺、裂纹、未焊透和未熔合五种缺陷的焊缝射线检测数字图像及其对应的标注文件划分为训练集、验证集和测试集；并利用经图像预处理后的训练集对基于Cascade Mask R-CNN模型搭建并优化的焊缝缺陷检测模型进行训练，采用验证集获取最优权重文件后对测试集进行测试以此评估所述焊缝缺陷检测模型的性能。基于Cascade Mask R-CNN模型进行焊缝射线检测的方法如图1所示，具体包括以下步骤：A welding seam defect detection method based on the Cascade Mask R-CNN model. First, the digital images of weld ray inspection and their corresponding annotation files containing five kinds of defects, including round defects, strip defects, cracks, lack of penetration and lack of fusion, are divided into It is a training set, a validation set and a test set; and the training set after image preprocessing is used to train the weld defect detection model built and optimized based on the Cascade Mask R-CNN model, and the validation set is used to obtain the optimal weight file. The test set was tested to evaluate the performance of the weld defect detection model. The method of welding seam ray detection based on Cascade Mask R-CNN model is shown in Figure 1, which includes the following steps:

实施例中共计选用2600张金属焊缝射线检测数字图像，焊缝图像的像素大小为400×3050，其缺陷类型有5种，分别为圆缺、条缺、裂纹、未焊透和未熔合，所述焊缝射线检测数字图像中各类缺陷如图2所示，按照9：1的比例将所有焊缝射线检测数字图像及对应的标注文件随机划分为训练集和测试集，再从训练集中随机划分出10％的数据作为验证集。In the example, a total of 2600 digital images of metal welds were selected for radiographic inspection. The pixel size of the weld images was 400 × 3050, and there were 5 types of defects, namely round defects, strip defects, cracks, incomplete penetration and incomplete fusion. The various types of defects in the digital image of the weld seam ray detection are shown in Figure 2. All digital images of the weld seam ray detection and the corresponding annotation files are randomly divided into training set and test set according to the ratio of 9:1, and then from the training set Randomly divide 10% of the data as the validation set.

S2、对焊缝射线检测数字图像进行图像预处理，得到增强且统一的图像；图像预处理的方法包括图像增强与图像去噪，所述图像增强采用AHE算法对所述焊缝射线检测数字图像进行图像增强，对所述焊缝射线检测数字图像进行细节锐化，凸显缺陷特征，所述图像增强的计算公式如下：S2. Perform image preprocessing on the digital image of the welding seam ray detection to obtain an enhanced and unified image; the image preprocessing method includes image enhancement and image denoising, and the image enhancement adopts the AHE algorithm to perform the digital image of the welding seam ray detection. Image enhancement is performed to sharpen the details of the digital image of the weld ray detection to highlight the defect features. The calculation formula of the image enhancement is as follows:

S3、基于Cascade Mask R-CNN原始模型搭建并优化焊缝缺陷检测模型，所述焊缝缺陷检测模型包括特征提取的卷积神经网络和分类与回归的预测网络；S3. Build and optimize a weld defect detection model based on the original Cascade Mask R-CNN model. The weld defect detection model includes a convolutional neural network for feature extraction and a prediction network for classification and regression;

所述卷积神经网络为ResNeXt-101,所述卷积神经网络包括卷积层、池化层和激活层，卷积层从输入的所述图像中提取特征，生成特征图；池化层用于去除冗余信息，减少参数量，扩大接受域；激活层增加输出的非线性，利用激活函数对输出层的结果进行卷积，得到非线性映射。The convolutional neural network is ResNeXt-101, and the convolutional neural network includes a convolutional layer, a pooling layer and an activation layer, and the convolutional layer extracts features from the inputted image to generate a feature map; the pooling layer uses To remove redundant information, reduce the amount of parameters, and expand the receptive field; the activation layer increases the nonlinearity of the output, and uses the activation function to convolve the results of the output layer to obtain a nonlinear mapping.

所述ResNeXt-101的网络结构如表1所示：The network structure of the ResNeXt-101 is shown in Table 1:

表1：ResNeXt-101的网络结构Table 1: Network structure of ResNeXt-101

S42、利用多尺度检测算法FPN改进RPN为所述特征图生成建议候选框，FPN中的横向连接融合具有高分辨率的低层特征信息与具有高语义的高层特征信息；S42, using the multi-scale detection algorithm FPN to improve the RPN to generate a suggested candidate frame for the feature map, and the horizontal connection in the FPN fuses low-level feature information with high resolution and high-level feature information with high semantics;

实施例利用多尺度检测算法FPN改进RPN为特征图生成建议候选框，包括以下步骤：The embodiment utilizes the multi-scale detection algorithm FPN to improve the RPN to generate a suggested candidate frame for the feature map, including the following steps:

S421：在网络的前向过程中前馈ResNeXt-101的一部分，记ResNeXt-101每级最后一个残差块的输出为{C1,C2,C3,C4,C5}，如图3所示，首先是自底向上的过程，每一级往上采用设定步长的降采样，将不改变特征图大小的层归为一个stage以此构成特征金字塔；S421: Feed forward a part of ResNeXt-101 in the forward process of the network, and record the output of the last residual block of each stage of ResNeXt-101 as {C1, C2, C3, C4, C5}, as shown in Figure 3, first It is a bottom-up process. Each level is up-sampling with a set step size, and the layers that do not change the size of the feature map are grouped into a stage to form a feature pyramid;

为了强化原有特征，引入高效注意力模块到ResNeXt-101的{C3,C4,C5}中，如图4所示，首先采用1×1的卷积W_k以及softmax获得注意力权重，通过注意力池化获取全局上下文特征，其次通过一个1×1的卷积W_v1和层归一化后，由ReLU函数激活，经过一个1×1的卷积W_v2，得到每个通道的重要程度，最后利用加法将全局上下文特征聚合到每个位置的特征上，形成长距离依赖关系；所述注意力模块计算公式如下：In order to strengthen the original features, an efficient attention module is introduced into {C3, C4, C5} of ResNeXt-101, as shown in Figure 4, firstly, the 1×1 convolution W _k and softmax are used to obtain the attention weight, and the attention Force pooling obtains global contextual features, followed by a 1×1 convolution W _v1 and layer normalization, activated by the ReLU function, and a 1×1 convolution W _v2 to get the importance of each channel, Finally, the global context features are aggregated to the features of each position by addition to form long-distance dependencies; the calculation formula of the attention module is as follows:

得到经过特征融合的特征层P之后，通过K-means聚类算法统计标注文件中矩形标注框的面积和长宽比，设置五种面积{32²,64²,128²,256²,512²}的anchor分别对应到{P2,P3,P4,P5,P6}五个特征层上，其中P6特征层通过P5特征层降采样得到，每种面积的anchor设置七种长宽比{1:10,1:5,1:2,1:1,2:1,5:1,10:1}，生成的anchor在特征层上滑动遍历，进而生成建议候选框。After obtaining the feature layer P that has undergone feature fusion, use the K-means clustering algorithm to count the area and aspect ratio of the rectangular annotation box in the annotation file, and set five areas {32 ² ,64 ² ,128 ² ,256 ² ,512 ² The anchors of } correspond to the five feature layers of {P2, P3, P4, P5, P6}, of which the P6 feature layer is obtained by downsampling the P5 feature layer, and the anchor of each area is set to seven aspect ratios {1:10 ,1:5,1:2,1:1,2:1,5:1,10:1}, the generated anchor slides and traverses on the feature layer, and then generates the proposed candidate frame.

S43：将所述建议候选框映射到所述特征图中得到所对应的特征矩阵，将所述特征矩阵通过ROI Align统一缩放到指定大小；S43: Map the proposed candidate frame to the feature map to obtain a corresponding feature matrix, and uniformly scale the feature matrix to a specified size through ROI Align;

S44：将所述建议候选框通过由目标分类器和边界框回归器组成的三级级联检测器，如图5所示，得到三个设定阈值下对应的缺陷类别和边界框回归参数；最后通过非极大值抑制以及滤除低概率目标得到最终的检测结果；S44: Pass the proposed candidate frame through a three-stage cascade detector composed of a target classifier and a bounding box regressor, as shown in Figure 5, to obtain corresponding defect categories and bounding box regression parameters under three set thresholds; Finally, the final detection result is obtained by non-maximum suppression and filtering out low-probability targets;

所述三级级联检测器中的边界框回归器被定义为一个级联回归问题，级联回归通过重采样改变不同阶段所要处理的样本分布，所述边界框回归器定义如下：The bounding box regressor in the three-stage cascade detector is defined as a cascade regression problem. The cascade regression changes the distribution of samples to be processed in different stages through resampling. The bounding box regressor is defined as follows:

上式中：x表示子图像块，b表示样本分布，f表示边界框回归器，f₁、f₂、f₃设定的阈值分别为0.4、0.5、0.6，f₁的输出作为f₂的输入，f₂的输出作为f₃的输入，{f₁,f₂,f₃}针对不同阶段的重采样分布进行优化，同时作用于训练和测试阶段。In the above formula: x represents the sub-image block, b represents the sample distribution, f represents the bounding box regressor, the thresholds set by f ₁ , _f ₂ , and _f ₃ are 0.4, 0.5, and 0.6, respectively. The input, the output of f ₂ is used as the input of f ₃ , {f ₁ , f ₂ , f ₃ } is optimized for the resampling distribution of different stages, acting on both training and testing stages.

S45：设置所述焊缝缺陷检测模型的训练参数：初始学习率设置为0.00125，动量设置为0.9，权重衰减设置为1e-4，batch_size的值设置为1，训练总轮数设置为40；利用ImageNet数据集训练权重作为所述焊缝缺陷检测模型的预训练权重，焊缝射线检测数字图像进入焊缝缺陷检测模型前统一缩放尺寸至350×2600，并使用随机梯度下降法更新所述焊缝缺陷检测模型的参数；采用所述训练集通过所述步骤S41-S44对所述焊缝缺陷检测模型进行训练，并采用所述验证集对每一轮训练得到的权重文件进行验证，获取在所述验证集上表现最优的权重文件；所述焊缝缺陷检测模型训练过程中损失函数值与迭代次数的关联曲线如图6所示。S45: Set the training parameters of the weld defect detection model: the initial learning rate is set to 0.00125, the momentum is set to 0.9, the weight decay is set to 1e-4, the value of batch_size is set to 1, and the total number of training rounds is set to 40; The ImageNet dataset training weight is used as the pre-training weight of the weld defect detection model. The digital image of weld ray detection is uniformly scaled to 350×2600 before entering the weld defect detection model, and the stochastic gradient descent method is used to update the weld. parameters of the defect detection model; use the training set to train the weld defect detection model through the steps S41-S44, and use the verification set to verify the weight files obtained in each round of training, and obtain the Figure 6 shows the correlation curve between the loss function value and the number of iterations during the training process of the weld defect detection model.

S5、利用所述测试集对应用步骤S45得出的在验证集上表现最优的权重文件进行测试，以此评估该焊缝缺陷检测模型的性能。S5. Use the test set to test the weight file with the best performance on the verification set obtained by applying step S45, so as to evaluate the performance of the weld defect detection model.

本实施例中利用训练集和测试集对所述焊缝缺陷检测模型与其他two-stage目标检测模型分别进行训练与测试，并引入每一类缺陷的平均精度(AP)以及各类缺陷AP的平均值(mAP)作为本实施例的评价指标。实验结果及对比如表2所示：In this embodiment, the training set and the test set are used to train and test the weld defect detection model and other two-stage target detection models respectively, and introduce the average precision (AP) of each type of defect and the AP of each type of defect. The average value (mAP) was used as the evaluation index in this example. The experimental results and comparisons are shown in Table 2:

表2：模型实验结果对比(IoU＝0.5)Table 2: Comparison of model experimental results (IoU=0.5)

分析上述实验结果可以得出：相比传统的two-stage目标检测模型，本实施例所述的焊缝缺陷检测模型关于裂纹、条缺、圆缺、未焊透和未熔合五种缺陷检测的AP都有明显提升，最终的mAP也高于其他模型。From the analysis of the above experimental results, it can be concluded that compared with the traditional two-stage target detection model, the weld defect detection model described in this embodiment has the advantages of five defects detection of crack, strip defect, round defect, incomplete penetration and lack of fusion. The AP has been significantly improved, and the final mAP is also higher than other models.

通过采用本发明公开的上述技术方案，得到了如下有益的效果：By adopting the above-mentioned technical scheme disclosed by the present invention, the following beneficial effects are obtained:

本发明公开了一种基于Cascade Mask R-CNN模型的焊缝缺陷检测方法，通过引入三级级联结构，改善了基于单一阈值的目标检测模型精度较低问题以及直接提高阈值而产生的过拟合问题；通过多尺度检测算法FPN对RPN进行改进，融合了具有高分辨率的低层特征信息与具有高语义的高层特征信息；通过在Cascade Mask R-CNN模型中引入高效注意力模块，在简化Non-Local注意力机制计算量的基础上，维持着相近的精度结果，并通过特征图的各个位置聚合相同的特征来强化原有特征。本发明相较于现有的two-stage目标检测模型以及原始Cascade Mask R-CNN模型，对焊缝缺陷的检测精度均有明显提升。The invention discloses a welding seam defect detection method based on the Cascade Mask R-CNN model. By introducing a three-level cascade structure, the problem of low accuracy of a target detection model based on a single threshold value and the overfitting caused by directly increasing the threshold value are improved The RPN is improved by the multi-scale detection algorithm FPN, which combines the low-level feature information with high resolution and the high-level feature information with high semantics; by introducing an efficient attention module into the Cascade Mask R-CNN model, it is easier to simplify Based on the calculation amount of the Non-Local attention mechanism, it maintains similar accuracy results, and strengthens the original features by aggregating the same features in each position of the feature map. Compared with the existing two-stage target detection model and the original Cascade Mask R-CNN model, the invention significantly improves the detection accuracy of weld defects.

以上所述仅是本发明的优选实施方式，应当指出，对于本技术领域的普通技术人员来说，在不脱离本发明原理的前提下，还可以做出若干改进和润饰，这些改进和润饰也应视本发明的保护范围。The above are only the preferred embodiments of the present invention. It should be pointed out that for those skilled in the art, without departing from the principles of the present invention, several improvements and modifications can be made. It should be regarded as the protection scope of the present invention.

Claims

1. A weld defect detection method based on a Cascade Mask R-CNN model is characterized by comprising the following steps:

s1, acquiring a welding seam radiographic inspection digital image containing five defects of round defect, strip defect, crack, incomplete penetration and incomplete fusion and a corresponding label file thereof; dividing the welding seam ray detection digital image and the corresponding label file into a training set, a verification set and a test set;

s2, carrying out image preprocessing on the welding line ray detection digital image to obtain an enhanced and unified image;

s3, building and optimizing a weld defect detection model based on a Cascade Mask R-CNN model, wherein the weld defect detection model comprises a convolutional neural network for feature extraction and a prediction network for classification and regression;

s4, training the weld defect detection model by using the training set, and verifying the weight file obtained by each training by using the verification set to obtain the weight file which is optimal to be expressed on the verification set; the method comprises the following steps:

s41, performing feature extraction on the welding line ray detection digital image through the convolutional neural network to form a feature map;

s42, improving the RPN by utilizing a multi-scale detection algorithm (FPN) to generate a suggested candidate box for the feature map;

s43, mapping the suggested candidate box to the feature diagram to obtain a corresponding feature matrix, and uniformly scaling the feature matrix to a specified size through ROIAlign;

s44, obtaining corresponding defect categories and boundary box regression parameters under three set thresholds through a three-level cascade detector consisting of a target classifier and a boundary box regressor by the suggested candidate box; finally, a final detection result is obtained by non-maximum value inhibition and low probability target filtering;

s45, setting training parameters of the weld defect detection model, training the weld defect detection model through the steps S41-S44 by adopting the training set, verifying the weight file obtained by each training by adopting the verification set, and acquiring the weight file which is optimal to be expressed on the verification set;

and S5, testing the weight file which is obtained by applying the step S45 and has the optimal performance on the verification set by using the test set, so as to evaluate the performance of the weld defect detection model.

2. The Cascade Mask R-CNN model-based weld defect detection method of claim 1, wherein the image preprocessing method in step S2 comprises image enhancement and image denoising, wherein the image enhancement adopts an AHE algorithm to perform image enhancement on the weld ray detection digital image, the weld ray detection digital image is subjected to detail sharpening to highlight defect features, and the image enhancement has the following calculation formula:

in the above formula: y is_i,jRepresenting the central pixel before transformation, Y_i,jRepresenting the transformed center pixel, m_i,jIs expressed as y_i,jThe gray level mean value of a local area of a central point, T represents a cumulative distribution transformation function of the point, k represents an adaptive function and is obtained by pixel characterization calculation of the local area;

the image denoising adopts a DMB algorithm to perform image denoising on the welding line ray detection digital image, so that the image noise is reduced, and the defect characteristics are reserved and enhanced.

3. The Cascade Mask R-CNN model-based weld defect detection method according to claim 1, characterized in that the convolutional neural network in step S41 is ResNeXt-101, the convolutional neural network comprises a convolutional layer, a pooling layer and an activation layer, and the convolutional layer extracts features from the input image to generate a feature map; the pooling layer is used for removing redundant information, reducing the number of parameters and expanding an acceptance domain; the activation layer increases the output nonlinearity, and the result of the output layer is convolved by using the activation function to obtain nonlinear mapping.

4. The Cascade Mask R-CNN model-based weld defect detection method according to claim 1, wherein in step S42, a proposed candidate frame is generated for the feature map by using FPN to improve RPN, and the method comprises the following steps:

s421: feeding forward a part of ResNeXt-101 in the forward process of the convolutional neural network, and recording the output of the last residual block of each stage of ResNeXt-101 as { C1, C2, C3, C4 and C5}, firstly, performing a bottom-up process, and reducing the sampling of a set step length upwards at each stage to form a stage without changing the size of a feature map so as to form a feature pyramid;

s422: the small feature map of the top layer is enlarged to the same size as the feature map of the last stage from top to bottom in an up-sampling mode;

s423: the cross-concatenation fuses the upsampled results with feature maps of the same size generated from bottom to top, and convolves each fused result with a 3 × 3 convolution kernel to obtain the final feature level P ═ P2, P3, P4, P5 }.

5. The Cascade Mask R-CNN model-based weld defect detection method according to claim 4, wherein the step S421 further comprises: a high efficiency attention module was introduced in { C3, C4, C5} of ResNeXt-101, first using a 1 × 1 convolution W_kAnd softmax obtains attention weight, acquires global context features by attention pooling, and then by a 1 × 1 convolution W_v1After normalization of the sum layer, activation by the ReLU function, and a further 1 × 1 convolution W_v2Obtaining the importance degree of each channel, and finally adding the obtained data to the original dataThe local context features are aggregated to the features of each position to form a long-distance dependency relationship; the attention module calculation formula is as follows:

in the above formula: z is a radical of_iIndicating the input of the attention module, Z_iIndicating the output of the attention module, N_pExpressed as a number of locations in the feature map,

weights representing global attention pooling, LN stands for layer normalization, W_v2ReLU(LN(W_v1(.))) represents the degree of importance of computing each channel.

6. The Cascade Mask R-CNN model-based weld defect detection method according to claim 4, further comprising, after the step S423: the area and the length-width ratio of a rectangular marking box in the marking file are counted through a K-means clustering algorithm, and five areas {32 } are set²,64²,128²,256²,512²The anchors of the areas correspond to five feature layers of { P2, P3, P4, P5 and P6} respectively, wherein the P6 feature layer is obtained by downsampling the P5 feature layer, seven aspect ratios of {1:10,1:5,1:2,1:1,2:1,5:1 and 10:1} are set for the anchors of each area, and the generated anchors are traversed on the feature layers in a sliding mode to generate the suggested candidate frames.

7. The Cascade Mask R-CNN model-based weld defect detection method of claim 1, wherein a boundary box regressor in the three-level Cascade detector in the step S44 is defined as a Cascade regression problem, the Cascade regression changes the sample distribution to be processed in different stages through resampling, and the boundary box regressor is defined as follows:

in the above formula: x denotes the subimage block, b denotes the sample distribution, f denotes the bounding box regressor, f₁、f₂、f₃The set threshold values are 0.4, 0.5, 0.6, respectively, f₁As output of f₂Input of f₂As output of f₃Input of { f }₁,f₂,f₃Optimizing the resampling distribution of different stages, and simultaneously acting on training and testing stages;

the total loss function L defining the training model includes two parts: the total loss function calculation formula comprises a bounding box regression loss and a target classification loss, wherein the total loss function calculation formula comprises the following steps:

L(x^t,g)＝L_cls(h_t(x^t),X^t)+μ[X^t≥1]L_loc(f_t(x^t,b^t),g)

in the above formula: l is_clsLoss function, L, representing the classification of the object_locLoss function representing regression of bounding box, { b^tDenotes the distribution of samples for different training phases t and has b^t＝f_t-1(x^t-1,b^t-1)，h_tRepresenting object classifier, f_tRepresenting a bounding box regressor, g representing the correspondence x^tMu represents the compromise coefficient, X^tDenotes x^tCorresponding label, [ ·]An indicator function is represented.

8. The Cascade Mask R-CNN model-based weld defect detection method according to claim 1, wherein the training parameter settings in step S45 include learning rate, momentum, weight attenuation, value of batch _ size and total number of training rounds, the training weights are used as pre-training weights of the weld defect detection model, the weld ray detection digital image is uniformly scaled before entering the weld defect detection model, and parameters of the weld defect detection model are updated by using a random gradient descent method.