CN113496480A - Method for detecting weld image defects - Google Patents

Method for detecting weld image defects

Info

Publication number
CN113496480A
CN113496480A (application CN202110524253.9A)
Authority
CN
China
Prior art keywords
detection
weld
module
model
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110524253.9A
Other languages
Chinese (zh)
Inventor
杜宇 (Du Yu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Digital Information Technology Co ltd
Original Assignee
Xi'an Digital Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Digital Information Technology Co ltd filed Critical Xi'an Digital Information Technology Co ltd
Priority to CN202110524253.9A priority Critical patent/CN113496480A/en
Publication of CN113496480A publication Critical patent/CN113496480A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20104 Interactive definition of region of interest [ROI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30152 Solder

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for detecting weld image defects, which comprises the following steps: constructing a second-order detection network learning model; acquiring radiographic image data, inputting it into the second-order detection network learning model, and training the model to obtain a detection model; adding a deformable convolution module at the detection end of the detection model and adding an offset field to the deformable convolution module; replacing the loss function in the model with a variant of the cross-entropy function; and detecting the weld defects and their types in the weld image to be detected with the improved detection model. The embodiment of the invention preprocesses the images with digital image processing techniques, trains the model on the constructed second-order Detection network DR-Detection, and packages and stores the model so that it is convenient to call and use. The algorithm is a weld image defect detection method based on a second-order detection network and can reduce manual workload and improve working efficiency.

Description

Method for detecting weld image defects
Technical Field
The invention relates to the technical field of image target detection, in particular to a method for detecting weld image defects.
Background
With the continuous deepening of informatization in the industrial field, intelligent manufacturing has become a major goal of industrial development and innovation. The importance of intelligent nondestructive testing and evaluation is now well established, and intelligent recognition of digital radiographic images, as an important branch of it, is bound to become a focus of nondestructive testing research with high application value. The existing intelligent analysis methods applicable to digital radiographic inspection are mainly data-driven machine learning and data mining algorithms, including classification, clustering, regression, association rule mining, and feature analysis. Applying these methods in the industrial field faces the following problems. First, the different material thicknesses and process methods of cast and welded objects produce many defect types, and the related data are high-dimensional, structurally complex, and small in sample size. How to analyze such large-scale complex data efficiently and effectively, and to find a feature set related to the defects from a large number of features to serve as labels for intelligent detection by computer vision, is a difficult problem. The invention focuses on defect detection in industrial digital radiographic images, designs a detection network, and solves a practical problem of nondestructive testing in the industrial field.
Disclosure of Invention
The embodiment of the invention aims to provide a method for detecting weld image defects that is dedicated to defect detection in industrial digital radiographic images, designs a detection network, and solves a practical problem of nondestructive testing in the industrial field. The specific technical scheme is as follows:
the embodiment of the invention provides a method for detecting weld image defects, which is applied to a processor and comprises the following steps:
constructing a deep second-order detection network learning model;
acquiring radiation image data, inputting the radiation image data into the deep second-order detection network learning model, and training the deep second-order detection network learning model to obtain a detection model;
adding a deformable convolution module at the detection end of the detection model and adding an offset field to the deformable convolution module; and replacing the loss function in the model with a variant of the cross-entropy function;
and detecting the weld defects and types in the weld image to be detected by adopting the improved detection model.
Further, after the step of detecting the weld defects and types in the weld image to be detected by using the improved detection model, the method further comprises: packaging and storing the improved detection model.
Further, the offset field added in the deformable convolution module is expressed as

$$y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$$

where the fixed sampling position p_0 + p_n is replaced by the non-fixed position p_0 + p_n + Δp_n, and $\mathcal{R}$ is the sampling grid.
Further, Δp_n comprises two degrees of freedom, x and y, with y/x less than or equal to 2.
Further, the deformable convolution module comprises symmetric convolution and asymmetric convolution; the asymmetric convolution is used for extracting irregular weld defects; the detection model comprises an Inception-ResNet module and a DropBlock layer.
Further, the DropBlock layer algorithm comprises two parameters, block_size and γ; γ is calculated as:

$$\gamma = \frac{1 - keep\_prob}{block\_size^2} \cdot \frac{feat\_size^2}{(feat\_size - block\_size + 1)^2}$$

where feat_size is the size of the feature map and keep_prob represents the probability that a neuron is retained. At the end of the DropBlock algorithm, to ensure that the output scales are consistent, the output is rescaled to compensate for the discarded neurons.
Further, the weld image to be detected includes pictures of weld defects such as gas holes, cracks, incomplete penetration, and incomplete fusion.
Further, the improved detection model comprises a feature extraction and classification module, an RPN module, and an ROI Align module. The feature extraction and classification module comprises a feature extraction network submodule, a feature fusion submodule, and a detection unit submodule, and is used for extracting features from and classifying the weld image to be detected. The RPN module is used for identifying the types of weld defects in the weld image to be detected after it has been processed by the feature extraction and classification module. The ROI Align module is used for connecting the feature extraction and classification module and the RPN module.
Further, in the RPN module, the classification loss is:

$$Loss_{cls} = -\frac{1}{N_{cls}} \sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$$

where N_cls = 256 denotes the number of anchors and candidate boxes participating in the calculation, p_i represents the score output for the anchor by the RPN, and y_i represents the label corresponding to the anchor, assigned by the rule:

$$y_i = \begin{cases} 1, & \text{anchor has the largest IOU with a target box, or IOU} > 0.7 \\ 0, & \text{IOU} < 0.3 \end{cases}$$

Here it is specified that the anchor with the largest IOU with a target box, or with IOU > 0.7, is a positive sample, and an anchor with IOU < 0.3 is a negative sample;

the localization loss of the RPN is:

$$Loss_{loc} = \frac{1}{N_{ROI}} \sum_i smooth_{L1}(d_i^* - d_i)$$

where N_ROI = 128, d_i^* represents the label corresponding to each ROI, and d_i represents the output of the network;

the detection loss of the RPN is:

$$Loss\_F_{cls} = -\frac{1}{N_{cls}} \sum_i \sum_c y_{ic} \log p_{ic}$$

where N_cls is taken as 128, y_ic represents the label of the sample in class c, assigned by the same rule as in the classification loss except that here the objects are the 128 ROIs output by the RPN and the 0/1 values are computed against the target boxes, and p_ic, the output of the final classification of the network, represents the probability that a sample belongs to class c, between 0 and 1.
Further, the expression of the variant of the cross-entropy function is:

$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$$

where α is used to balance the importance of each sample and γ adjusts the rate at which simple samples are down-weighted; when γ = 0 the formula behaves as a cross-entropy function; α is taken as 0.25 and γ as 2.
The embodiment of the invention preprocesses the images with digital image processing techniques; trains the model on the constructed second-order Detection network DR-Detection; and packages and stores the model so that it is convenient to call and use. The algorithm is a weld image defect detection method based on a second-order detection network and can reduce manual workload and improve working efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is an algorithm flowchart of a method for detecting a defect in a weld image according to an embodiment of the present invention.
FIG. 2 is a diagram comparing standard convolution and deformable convolution modules.
FIG. 3 is a diagram of an implementation process of the deformable convolution module.
FIG. 4 is a digital radiograph defect aspect ratio histogram.
FIG. 5 is a diagram of an implementation of a deformable convolution module that limits aspect ratios.
FIG. 6 is a diagram comparing a DropBlock module with a dropout module.
Fig. 7 is a block diagram of a feature extraction network.
Fig. 8 is a diagram of a feature extraction network architecture.
FIG. 9 is a representation of the anchor in a feature map.
Fig. 10 is a schematic diagram of a feature fusion network structure.
FIG. 11 is a diagram of a DR-detection network architecture based on second order detection.
FIG. 12 is a diagram of DR-Detection network Detection results.
FIG. 13 is a schematic view of ROI Align.
FIG. 14 is a flowchart of a method for detecting defects in a weld image.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical terms referred to in this embodiment are explained as follows:
Faster R-CNN is mainly divided into two parts:
an RPN (Region Proposal Network), which generates high-quality region proposals;
Fast R-CNN, which performs detection using the region proposals.
The technical solution in the embodiment of the present invention will be described below with reference to the accompanying drawings in the embodiment of the present invention.
The embodiment of the invention aims to provide a method for detecting weld image defects that solves the problems of low efficiency and low precision in existing detection methods.
Example 1
In a first aspect, referring to fig. 1 and 14, an embodiment of the present invention provides a method for detecting a defect in a weld image, applied to a processor, including:
s110, constructing a deep second-order detection network learning model.
S120, acquiring radiation image data, inputting the radiation image data into the deep second-order detection network learning model, and training the deep second-order detection network learning model to obtain a detection model.
S130, adding a deformable convolution module at the detection end of the detection model, and adding an offset domain in the deformable convolution module; and the loss function within the model is modified to be a variant of the cross-entropy function.
S140, detecting the welding seam defects and types in the welding seam image to be detected by adopting the improved detection model.
Specifically, this embodiment provides an algorithm flow for intelligent detection of weld defects, including:
Designing each detection module and constructing the deep neural network detection model DR-Detection from the basic modules.
Training the model using radiographic image data.
Packaging and storing the model, so that subsequent calling and testing are convenient; a sketch of this flow follows.
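As an illustration only, the build-train-package flow above could look like the following minimal PyTorch sketch. The DRDetection stub, dataset handling, file name, and training settings are all assumptions for illustration; the real network is assembled from the modules described in the later embodiments:

```python
import torch
import torch.nn as nn

# Placeholder for the DR-Detection network described in this patent; the actual
# architecture (feature extraction, pyramid fusion, RPN, ROI Align) is built in
# the later embodiments. This stub only illustrates the overall flow.
class DRDetection(nn.Module):
    def __init__(self, num_classes: int = 5):  # e.g. background + 4 defect types
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

model = DRDetection()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Train on radiographic image data (DataLoader construction omitted).
for images, labels in []:  # replace [] with a real DataLoader
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()

# Package and store the trained model so it can be called later.
torch.save(model.state_dict(), "dr_detection.pth")

# Later: load the stored model for testing.
model2 = DRDetection()
model2.load_state_dict(torch.load("dr_detection.pth"))
model2.eval()
```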
Compared with existing weld defect detection algorithms, this algorithm has the following advantages.
Compared with existing object detection algorithms, it achieves higher precision on digital radiographic images.
Traditional object detection algorithms such as Faster R-CNN, YOLO, and SSD are mostly developed on the ImageNet image dataset, whose images mainly come from ordinary camera systems. Digital radiographic images have their own characteristics and need designs tailored to those characteristics. The DR-Detection network finally formed by the invention is an object detection network aimed specifically at weld images and achieves higher precision on weld images than traditional frameworks and networks.
The network design dedicated to weld images in this algorithm is a complete network construction and training scheme oriented to digital radiographic images and can meet practical requirements.
Example 2
For the deformable convolution design aimed at long and narrow defects, deformable convolution technology is adopted: by changing the CNN's sampling pattern, the modeling limitation of the traditional CNN is removed and the CNN's ability to extract features from irregular images is enhanced.
In a two-dimensional convolution, each point on the output feature map involves two components: grid sampling on the input feature map, and a multiply-add operation over the sampled points. The extent of this sampling grid represents the receptive field of the CNN itself. Taking a 3 × 3 convolution as an example, the sampling grid can be defined as:

$$\mathcal{R} = \{(-1,-1), (-1,0), \ldots, (0,1), (1,1)\}$$

For a point p_0 in the output feature map, its value is solved as:

$$y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n)$$
in deformable convolution, it is aimed at sampling grid
Figure BDA0003065209540000072
There is one offset field corresponding to each value in the field and the grid
Figure BDA0003065209540000073
Corresponding, is described as { Δ p n1, …, N }, wherein
Figure BDA0003065209540000074
Then the output profile solving equation is:
Figure BDA0003065209540000075
this results in a fixed position p in the sampling mode0+pnInto a non-fixed form p0+pn+ΔpnWherein Δ pnIs why the deformable convolution can adapt to irregular objects. And in the feature map, the sampling points, namely p, are always required0、pnAnd Δ pnAre all integers if the sample point p0+pn+ΔpnAnd if the pixel is not an integer point pixel, performing interpolation solution by using a bilinear interpolation formula. The difference between the deformable convolution and the standard convolution is shown in fig. 2.
As can be seen from fig. 2, for irregular long and narrow defects the deformable convolution adapts better to the characteristics of the defect itself. In deformable convolution it is not the convolution kernel that deforms: for a point p_0 in the output feature map, the sampling center in the original feature map is still p_0 + p_n, the same 3 × 3 grid as in standard convolution. The true deformation lies in the learned offset Δp_n, which lets the sampling points capture features elsewhere with a positional offset, so the key to deformable convolution is how to learn this offset. The implementation uses two layers of standard convolution: the first convolution learns the offsets, the feature map is then resampled according to the offsets, and finally the standard convolution calculation is applied. The flow is shown in fig. 3, and a sketch follows.
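The two-layer scheme above can be sketched with torchvision's deformable convolution operator. This is a minimal sketch under the assumption that a plain 3 × 3 convolution predicts the offsets; it is not the patent's exact configuration:

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableBlock(nn.Module):
    """First conv learns the offsets Δp_n; the deformable conv then samples
    the input at p_0 + p_n + Δp_n (bilinear interpolation is applied
    internally for non-integer positions)."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        # 2 * k * k offset channels: one (y, x) pair per kernel position.
        self.offset_conv = nn.Conv2d(in_ch, 2 * k * k, k, padding=k // 2)
        nn.init.zeros_(self.offset_conv.weight)  # start from the standard grid
        nn.init.zeros_(self.offset_conv.bias)
        self.deform_conv = DeformConv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        offsets = self.offset_conv(x)
        return self.deform_conv(x, offsets)

x = torch.randn(1, 64, 32, 128)   # e.g. a long, narrow weld feature map
y = DeformableBlock(64, 64)(x)
print(y.shape)                    # torch.Size([1, 64, 32, 128])
```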
for a feature map with an input dimension N, the learned offset is 2, which includes two degrees of freedom: x and y, respectively. The two degrees of freedom are not limited, and the distribution of the two degrees of freedom in the whole two-dimensional space is difficult to learn, the invention combines the characteristics of digital rays to limit the two degrees of freedom and reduce the searching cost of the two degrees of freedom in the two-dimensional space, and the statistical analysis is carried out on the width-to-length ratio of defects to obtain:
from FIG. 4 it can be seen that the width to length ratio of the defect is much less than 2, and therefore the present invention limits Δ pnThe ratio y/x of the two degrees of freedom x and y is less than or equal to 2, namely the two degrees of freedom are independently processed, aiming at the long and narrow characteristic of the welding seam image, the invention mainly limits the length x in code realization, and the flow is described as followsAs shown in fig. 5.
We denote this activation function as:

[expression presented as an image in the original]

One possible constraint implementation is sketched below.
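Since the exact activation expression above is not recoverable here, the following sketch only illustrates one way such a ratio constraint could be imposed on learned offsets. The clamping rule and the (Δy, Δx) channel layout are assumptions for illustration, not the patent's formula:

```python
import torch

def constrain_offsets(offsets: torch.Tensor, max_ratio: float = 2.0) -> torch.Tensor:
    """Hypothetical constraint: cap |Δy| at max_ratio * |Δx| so that the
    y/x ratio of each offset pair stays <= 2 (cf. claim 4).

    offsets: (N, 2*k*k, H, W), assumed laid out as interleaved (Δy, Δx)
    pairs, as produced for torchvision's DeformConv2d.
    """
    y = offsets[:, 0::2]                       # Δy components
    x = offsets[:, 1::2]                       # Δx components
    limit = max_ratio * x.abs().clamp(min=1e-6)
    y = torch.clamp(y, -limit, limit)          # enforce |Δy| <= 2 |Δx|
    out = torch.empty_like(offsets)
    out[:, 0::2], out[:, 1::2] = y, x
    return out
```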
example 3
The invention provides a network suitable for digital radiographic image feature extraction.
Among the many network structures, the Inception family expands the data dimension well, while ResNet is designed against network degradation as depth grows; combining the two expands the feature dimension while allowing the network to be deepened and deeper features to be extracted. The network designed by the invention uses an Inception-ResNet module. To strengthen feature extraction for incomplete-penetration-type long and narrow defects, a B module containing 1 × 7 and 7 × 1 convolutions is also used; such asymmetric convolutions facilitate feature extraction on irregular defects. On top of the original module, considering that digital radiographic images have simple semantics, a DropBlock layer is embedded in the module to avoid overfitting. The traditional dropout layer avoids overfitting by discarding neurons with a certain probability and is usually used in fully connected layers. A convolutional layer, however, follows the receptive-field mechanism of the human visual system: every position in the feature map has a receptive field, and discarding a single pixel removes only one small receptive field while others remain, so the pixel's value can still be inferred from neighboring receptive fields. Discarding a single pixel therefore does not reduce the feature range for a convolutional layer; the network can recover the lost semantic information from neighborhood information, so robustness is not increased. DropBlock therefore drops whole blocks of pixels, essentially zeroing the stored features, to force the network to learn more robust features. A comparison of DropBlock with traditional dropout is shown in fig. 6:
In fig. 6, (a) shows traditional dropout, with the green boxes marking target features, and (b) shows DropBlock, which masks a square of features around each selected point. The specific algorithm is shown in Table 1:
TABLE 1 The DropBlock algorithm

[Table 1 was presented as an image in the original.]
There are two parameters in the DropBlock algorithm, block_size and γ. The invention proposes block_size = 7, and γ is calculated by first determining the ratio of neurons to be retained and then deriving the Bernoulli distribution parameter the neurons should follow:
$$\gamma = \frac{1 - keep\_prob}{block\_size^2} \cdot \frac{feat\_size^2}{(feat\_size - block\_size + 1)^2}$$

where feat_size is the size of the feature map and keep_prob represents the probability that a neuron is retained. At the end of the DropBlock algorithm, to ensure that the output scales are consistent, the output is rescaled to compensate for the discarded neurons. A sketch follows.
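A minimal DropBlock sketch following the published algorithm (Ghiasi et al.), using the γ formula above; the exact patent implementation may differ in details such as where block seeds are allowed near the feature borders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropBlock2d(nn.Module):
    def __init__(self, block_size: int = 7, keep_prob: float = 0.9):
        super().__init__()
        self.block_size, self.keep_prob = block_size, keep_prob

    def forward(self, x):
        if not self.training:
            return x
        feat_size = x.shape[-1]
        # Bernoulli rate derived from keep_prob, as in the formula above.
        gamma = ((1 - self.keep_prob) / self.block_size ** 2
                 * feat_size ** 2 / (feat_size - self.block_size + 1) ** 2)
        seeds = torch.bernoulli(torch.full_like(x, gamma))
        # Expand each seed into a dropped block_size x block_size square.
        mask = 1 - F.max_pool2d(seeds, self.block_size,
                                stride=1, padding=self.block_size // 2)
        # Rescale so the expected activation magnitude is unchanged.
        return x * mask * mask.numel() / mask.sum().clamp(min=1)
```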
After the DropBlock structure is added, the module finally used is shown in fig. 7;
The final feature extraction network structure is shown in fig. 8. The size and channels of the feature map are unchanged within module A and module B; down-sampling of the image and changes of channel number are realized by 3 × 3 convolutions. The × 2 in fig. 8 indicates that the structure is repeated twice; when a module is repeated, the first 3 × 3 convolution kernel does not change the channel number while the second one does. Finally, a global pooling layer replaces the fully connected layer to flatten the features for Softmax classification.
the present invention places the modified deformable convolution layer on the last feature map because the use of deformable convolution is more demanding on the feature map and it is expected that the bias of the feature map itself will be learned only for well-defined feature maps.
Example 4
The multi-scale feature fusion method for digital radiographic image defects comprises an anchor size design method and a feature-pyramid-based feature fusion method.
First, the anchor size design method is explained. The invention assigns different anchors to feature maps of different sizes, so that different feature maps detect different targets with their own anchors. Based on the principle that a large feature map always has a smaller receptive field, scales image features less obviously, and is suitable for detecting small targets, while a small feature map has a larger receptive field and is suitable for detecting large targets, the method divides the anchors into 3 groups corresponding to feature maps of different sizes:
TABLE 2 Correspondence between feature maps and anchors

[Table 2 was presented as an image in the original.]
Compared with the original Faster R-CNN, which presets 9 anchors at each pixel position, the method presets 2 anchors at each pixel position, which greatly reduces the workload. The anchors are set from a defect analysis of digital radiographic images, so they better match the research scenario, and the regions containing defects can be extracted as candidate regions. Anchors generated by cluster analysis aggregate defect knowledge and constitute prior knowledge; exploring and adjusting on prior knowledge is more efficient than searching the spatial structure of the solution. To show how different feature maps handle features of different sizes, a small grid cell in the image represents one feature unit; the appearance of the anchors in the feature maps is shown in fig. 9, and a sketch of this anchor placement follows.
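Since Table 2 is not reproduced here, the following sketch only illustrates the mechanism of presetting 2 anchors per feature-map cell across 3 scale groups. All concrete anchor sizes below are hypothetical placeholders, not the patent's clustered values:

```python
import torch

# Hypothetical (w, h) anchor sizes per feature-map group; the patent's
# actual values come from cluster analysis of defect boxes (Table 2).
ANCHOR_GROUPS = {
    "P1": [(16, 16), (64, 8)],      # large map, small receptive field: small/narrow defects
    "P2": [(48, 48), (128, 24)],
    "P3": [(128, 128), (256, 64)],  # small map, large receptive field: large defects
}

def make_anchors(feat_h: int, feat_w: int, stride: int, sizes) -> torch.Tensor:
    """Place 2 anchors (x1, y1, x2, y2) at every feature-map cell."""
    ys, xs = torch.meshgrid(torch.arange(feat_h), torch.arange(feat_w), indexing="ij")
    cx = (xs.flatten() + 0.5) * stride  # cell centers in image coordinates
    cy = (ys.flatten() + 0.5) * stride
    anchors = []
    for w, h in sizes:
        anchors.append(torch.stack([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2], dim=1))
    return torch.cat(anchors)           # (feat_h * feat_w * 2, 4)

print(make_anchors(64, 64, 8, ANCHOR_GROUPS["P1"]).shape)  # torch.Size([8192, 4])
```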
Next, the feature-pyramid-based feature fusion method is explained. The invention changes the fusion of feature maps from addition to concatenation: the original feature channels are halved by a 1 × 1 convolution, the feature maps are concatenated, and another 1 × 1 convolution then learns the associated information from the different feature maps. In this way position information and semantic information are fully preserved without increasing the amount of computation. The feature fusion structure finally used is shown in fig. 10.
The invention up-samples by bilinear interpolation and then refines the sampled features with a 3 × 3 convolution to add feature detail. As can be seen from the figure, the feature map of the second layer of the feature extraction network is merged with that of the last layer after passing through two 1 × 1 convolutions and two 3 × 3 convolutions; compared with the path inside the feature extraction network, which needs 2 B modules, two 3 × 3 convolutions, and 1 deformable convolution module, this path is shorter and more favorable to the flow of position information. For the specific implementation of the Block module and the deformable convolution module in fig. 10, see the related content of the preceding embodiments. Three feature maps P_1, P_2 and P_3 of different sizes but 256 channels each are finally output, as sketched below.
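A sketch of one fusion step as described above: halve channels with 1 × 1 convolutions, concatenate, fuse with another 1 × 1 convolution, and refine the bilinear upsampling with a 3 × 3 convolution. The 256-channel width follows the text; the rest of the wiring is an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseStep(nn.Module):
    """Concatenation-based pyramid fusion: the top-down feature is upsampled
    (bilinear + 3x3 refinement), both inputs are halved to 128 channels by
    1x1 convs, concatenated back to 256, then fused by a 1x1 conv."""
    def __init__(self, ch: int = 256):
        super().__init__()
        self.refine = nn.Conv2d(ch, ch, 3, padding=1)  # corrects upsampled features
        self.halve_top = nn.Conv2d(ch, ch // 2, 1)
        self.halve_lat = nn.Conv2d(ch, ch // 2, 1)
        self.fuse = nn.Conv2d(ch, ch, 1)               # learns cross-map associations

    def forward(self, top, lateral):
        top = F.interpolate(top, size=lateral.shape[-2:],
                            mode="bilinear", align_corners=False)
        top = self.refine(top)
        out = torch.cat([self.halve_top(top), self.halve_lat(lateral)], dim=1)
        return self.fuse(out)

p_top = torch.randn(1, 256, 16, 16)
p_lat = torch.randn(1, 256, 32, 32)
print(FuseStep()(p_top, p_lat).shape)  # torch.Size([1, 256, 32, 32])
```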
Example 5
A DR-Detection network for the digital radiographic image is constructed based on the feature extraction network.
The feature extraction backbone of the invention uses the feature extraction network proposed in embodiment 3. On this basis, the invention performs feature fusion with the improved feature pyramid, applies an attention mechanism in the RPN to improve detection accuracy, uses ROI Align in place of the original ROI pooling to reduce the position deviation caused by integer quantization, and finally uses fully connected layers for classification and localization. The overall structure of the network is shown in fig. 11, with each module designed in detail in the relevant sections; the invention calls it the DR-Detection network. End-to-end detection is finally realized, and the detection results are shown in fig. 12.
The whole network architecture can be seen as comprising three parts: the feature extraction and classification part formed by the feature extraction network, feature fusion, and the detection unit; the relatively independent RPN; and the ROI Align part that connects the two. Since ROI Align itself contains no trainable parameters, the accuracy of the network depends mainly on the results of the RPN and of the detection unit, so the loss function of the network consists of two main parts: the RPN loss and the detection loss. ROI Align, the RPN loss, and the detection loss are explained below.
1) ROI Align component
ROI Align is a structure proposed for Mask R-CNN to improve image segmentation accuracy; it maps candidate-box features into a fixed-dimension output while removing the feature position errors caused by the two rounding operations in ROI pooling. In the invention, the pixel coordinates of the feature map corresponding to a down-sampled candidate box are kept as floating point numbers and are not rounded. For example, for 56/16 = 3.5, the feature value at the non-integer position 3.5 is obtained by bilinear interpolation from the four adjacent integer points; for the 3.5 × 3.5 feature region divided into a 2 × 2 grid, four sampling points are generated per cell, whose feature values are likewise computed by bilinear interpolation, and the sampling points in each cell are finally averaged to give the cell's feature value.
According to the ROI Align diagram of fig. 13, the invention divides input features of different sizes into 4 × 4, 2 × 2 and 1 × 1 grids. With feature map channels all equal to 256, a feature of any size is converted into a (4 × 4 + 2 × 2 + 1 × 1) × 256 = 5376-dimensional vector, which is used for classification and for further adjustment of the output candidate box. A sketch follows.
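A sketch of this multi-scale ROI Align pooling using torchvision's roi_align operator; the spatial_scale and sampling_ratio values are assumptions for illustration:

```python
import torch
from torchvision.ops import roi_align

features = torch.randn(1, 256, 64, 64)               # 256-channel feature map
boxes = [torch.tensor([[10.0, 10.0, 66.0, 38.0]])]   # one candidate box per image

# Pool the same ROI to 4x4, 2x2 and 1x1 grids and concatenate:
# (16 + 4 + 1) * 256 = 5376 dimensions, as in the text.
pooled = [
    roi_align(features, boxes, output_size=s,
              spatial_scale=1.0 / 16, sampling_ratio=2).flatten(1)
    for s in (4, 2, 1)
]
vec = torch.cat(pooled, dim=1)
print(vec.shape)  # torch.Size([1, 5376])
```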
2) RPN loss
The RPN aims to accurately distinguish whether an anchor belongs to the foreground or the background, and to adjust the anchor preliminarily so that it comes closer to the true target position. The loss of this part therefore comprises two parts: a classification loss, measuring the accuracy of judging anchors as foreground or background, and a localization loss, measuring the accuracy of adjusting anchors toward the true positions.
Although the binary classification in the RPN covers all anchors, what actually affects the result are the candidate boxes finally fed into ROI Align as input, so the classification loss is computed only over the finally generated candidate boxes with RPN_batch_size = 256. This part is calculated as:
$$Loss_{cls} = -\frac{1}{N_{cls}} \sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$$

where N_cls = 256 denotes the number of anchors and candidate boxes participating in the calculation, p_i represents the score output for the anchor by the RPN, and y_i represents the label corresponding to the anchor, assigned by the rule:

$$y_i = \begin{cases} 1, & \text{anchor has the largest IOU with a target box, or IOU} > 0.7 \\ 0, & \text{IOU} < 0.3 \end{cases}$$
Here it is specified that the anchor with the largest IOU with a target box, or with IOU > 0.7, is a positive sample, an anchor with IOU < 0.3 is a negative sample, and the others are not considered. A sketch of this sampled classification loss follows.
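A sketch of the labeling rule and the sampled binary classification loss above. The IOU computation uses torchvision's box_iou; the scores are assumed to be sigmoid probabilities, and the draw of 256 boxes is shown in simplified (unshuffled, unbalanced) form:

```python
import torch
import torch.nn.functional as F
from torchvision.ops import box_iou

def rpn_cls_loss(anchors, scores, targets, n_cls: int = 256):
    """anchors: (A, 4) boxes; scores: (A,) foreground probabilities from
    the RPN (post-sigmoid); targets: (T, 4) ground-truth defect boxes."""
    iou = box_iou(anchors, targets)          # (A, T)
    best_iou, _ = iou.max(dim=1)
    labels = torch.full_like(scores, -1.0)   # -1 = ignored
    labels[best_iou < 0.3] = 0.0             # negatives
    labels[best_iou > 0.7] = 1.0             # positives
    labels[iou.argmax(dim=0)] = 1.0          # anchor with largest IOU per target
    keep = (labels >= 0).nonzero().squeeze(1)[:n_cls]  # simplified 256-box draw
    return F.binary_cross_entropy(scores[keep], labels[keep])
```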
When the localization loss of the RPN is calculated, only the candidate boxes whose anchors are classified as foreground are used, because the negative samples do not take part in the subsequent detection and localization. Each anchor corresponds to four adjustment variables, representing the adjustments to the horizontal and vertical coordinates of its center point and to its width and height. The adjustment values between each anchor and the target box with the largest IOU can be derived inversely from:

$$t_x = \frac{x - x_a}{w_a}, \quad t_y = \frac{y - y_a}{h_a}, \quad t_w = \log\frac{w}{w_a}, \quad t_h = \log\frac{h}{h_a}$$

where (x, y, w, h) denotes the center coordinates and size of the target box and (x_a, y_a, w_a, h_a) those of the anchor.
the four variables that can be used to calculate the adjustment target for each anchor represent continuous variables, where the regression loss is calculated using the Smooth L1 loss, using the formula:
Figure BDA0003065209540000142
the localization loss of this ROI will be followed by a corresponding addition of four variables, and the localization loss of the RPN network can be expressed as:
Figure BDA0003065209540000143
in the formula (II)ROI=128,
Figure BDA0003065209540000144
Representing the label corresponding to each ROI, diRepresenting the output of the network.
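A sketch of the regression targets and the Smooth L1 localization loss above; the (cx, cy, w, h) box layout is an assumption:

```python
import torch
import torch.nn.functional as F

def encode_targets(anchors, gt):
    """Regression targets t = (tx, ty, tw, th) between anchors and their
    matched ground-truth boxes, both given as (cx, cy, w, h)."""
    tx = (gt[:, 0] - anchors[:, 0]) / anchors[:, 2]
    ty = (gt[:, 1] - anchors[:, 1]) / anchors[:, 3]
    tw = torch.log(gt[:, 2] / anchors[:, 2])
    th = torch.log(gt[:, 3] / anchors[:, 3])
    return torch.stack([tx, ty, tw, th], dim=1)

def rpn_loc_loss(d_pred, d_star, n_roi: int = 128):
    # F.smooth_l1_loss implements 0.5 x^2 for |x| < 1 and |x| - 0.5 otherwise.
    return F.smooth_l1_loss(d_pred, d_star, reduction="sum") / n_roi
```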
3) Detecting loss
The ROIs passed down from the RPN number 128, half of RPN_batch_size. The classification of this part is a multi-class problem, and its loss is defined as:

$$Loss\_F_{cls} = -\frac{1}{N_{cls}} \sum_i \sum_c y_{ic} \log p_{ic}$$

where N_cls is taken as 128, y_ic represents the label of sample i in class c, assigned by the same rule as in the RPN classification loss except that here the objects are the 128 ROIs output by the RPN and the 0/1 values are computed against the target boxes, and p_ic, the output of the final classification of the network, is the probability that sample i belongs to class c, between 0 and 1.
The invention also uses the Focal loss to mine hard samples within the second-order detection framework. This loss function attributes the imbalance of positive and negative samples to the large number of simple samples, i.e. background, seen by the classifier during training, and promotes the learning of hard-to-classify samples by giving simple samples a small loss weight and hard-to-classify samples a large loss weight. It is a variant of the cross-entropy function and is expressed as:
$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$$

where α is used to balance the importance of each sample and γ adjusts the rate at which simple samples are down-weighted; when γ = 0 the formula behaves as a cross-entropy function. Based on its research, the invention takes α = 0.25 and γ = 2. A sketch follows.
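A sketch of this Focal-loss variant for the multi-class head; the conversion from per-class probabilities to p_t follows the standard Focal loss formulation and is assumed here:

```python
import torch

def focal_loss(probs: torch.Tensor, labels: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """probs: (N, C) class probabilities; labels: (N,) class indices.
    FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t); gamma = 0 recovers
    (alpha-weighted) cross entropy."""
    p_t = probs.gather(1, labels.unsqueeze(1)).squeeze(1).clamp(min=1e-7)
    return (-alpha * (1 - p_t) ** gamma * p_t.log()).mean()

probs = torch.softmax(torch.randn(4, 5), dim=1)  # 4 ROIs, 5 classes
labels = torch.tensor([0, 2, 1, 4])
print(focal_loss(probs, labels))
```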
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for detecting weld image defects, applied to a processor, characterized by comprising the following steps:
constructing a deep second-order detection network learning model;
acquiring radiation image data, inputting the radiation image data into the deep second-order detection network learning model, and training the deep second-order detection network learning model to obtain a detection model;
adding a deformable convolution module at the detection end of the detection model and adding an offset field to the deformable convolution module; and replacing the loss function in the model with a variant of the cross-entropy function;
and detecting the weld defects and types in the weld image to be detected by adopting the improved detection model.
2. The method for detecting the weld image defects according to claim 1, further comprising the following steps after the step of detecting the weld defects and types in the weld image to be detected by using the improved detection model: and packaging and storing the improved detection model.
3. The method for detecting weld image defects according to claim 1, wherein the offset field added in the deformable convolution module is expressed as

$$y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n) \cdot x(p_0 + p_n + \Delta p_n)$$

where the fixed sampling position p_0 + p_n is replaced by the non-fixed position p_0 + p_n + Δp_n, and $\mathcal{R}$ is the sampling grid.
4. The weld image defect detection method according to claim 3, wherein Δp_n comprises two degrees of freedom, x and y, with y/x less than or equal to 2.
5. The weld image defect detection method according to claim 1, wherein the deformable convolution module comprises symmetric convolution and asymmetric convolution; the asymmetric convolution is used for extracting irregular weld defects; the detection model comprises an Inception-ResNet module and a DropBlock layer.
6. The weld image defect detection method according to claim 5, wherein the DropBlock layer algorithm comprises two parameters, block_size and γ; γ is calculated as:

$$\gamma = \frac{1 - keep\_prob}{block\_size^2} \cdot \frac{feat\_size^2}{(feat\_size - block\_size + 1)^2}$$

where feat_size is the size of the feature map and keep_prob represents the probability that a neuron is retained; at the end of the DropBlock algorithm, to ensure that the output scales are consistent, the output is rescaled to compensate for the discarded neurons.
7. The weld image defect detection method according to any one of claims 1 to 6, wherein the weld image to be detected comprises pictures of weld defects such as gas holes, cracks, incomplete penetration, and incomplete fusion.
8. The weld image defect detection method according to any one of claims 1 to 6, wherein the improved detection model comprises: a feature extraction and classification module, an RPN module, and an ROI Align module; the feature extraction and classification module comprises a feature extraction network submodule, a feature fusion submodule, and a detection unit submodule, and is used for extracting features from and classifying the weld image to be detected; the RPN module is used for identifying the types of weld defects in the weld image to be detected after it has been processed by the feature extraction and classification module; and the ROI Align module is used for connecting the feature extraction and classification module and the RPN module.
9. The method for detecting weld image defects according to claim 8, wherein in the RPN module the classification loss is:

$$Loss_{cls} = -\frac{1}{N_{cls}} \sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$$

where N_cls = 256 denotes the number of anchors and candidate boxes participating in the calculation, p_i represents the score output for the anchor by the RPN, and y_i represents the label corresponding to the anchor, assigned by the rule:

$$y_i = \begin{cases} 1, & \text{anchor has the largest IOU with a target box, or IOU} > 0.7 \\ 0, & \text{IOU} < 0.3 \end{cases}$$

that is, the anchor with the largest IOU with a target box, or with IOU > 0.7, is a positive sample, and an anchor with IOU < 0.3 is a negative sample;

the localization loss of the RPN is:

$$Loss_{loc} = \frac{1}{N_{ROI}} \sum_i smooth_{L1}(d_i^* - d_i)$$

where N_ROI = 128, d_i^* represents the label corresponding to each ROI, and d_i represents the output of the network;

the detection loss of the RPN is:

$$Loss\_F_{cls} = -\frac{1}{N_{cls}} \sum_i \sum_c y_{ic} \log p_{ic}$$

where N_cls is taken as 128, y_ic represents the label of sample i in class c, assigned by the same rule as in the classification loss except that here the objects are the 128 ROIs output by the RPN and the 0/1 values are computed against the target boxes, and p_ic, the output of the final classification of the network, represents the probability that sample i belongs to class c, between 0 and 1.
10. The weld image defect detection method according to any one of claims 1 to 6, wherein the expression of the variant of the cross-entropy function is:

$$FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$$

where α is used to balance the importance of each sample and γ adjusts the rate at which simple samples are down-weighted; when γ = 0 the formula behaves as a cross-entropy function; α is taken as 0.25 and γ as 2.
CN202110524253.9A 2021-05-13 2021-05-13 Method for detecting weld image defects Pending CN113496480A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110524253.9A CN113496480A (en) 2021-05-13 2021-05-13 Method for detecting weld image defects

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110524253.9A CN113496480A (en) 2021-05-13 2021-05-13 Method for detecting weld image defects

Publications (1)

Publication Number Publication Date
CN113496480A true CN113496480A (en) 2021-10-12

Family

ID=77997947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110524253.9A Pending CN113496480A (en) 2021-05-13 2021-05-13 Method for detecting weld image defects

Country Status (1)

Country Link
CN (1) CN113496480A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113888542A (en) * 2021-12-08 2022-01-04 常州微亿智造科技有限公司 Product defect detection method and device
CN114565802A (en) * 2021-12-15 2022-05-31 北京信息科技大学 Wind driven generator extraction method
CN114943843A (en) * 2022-06-14 2022-08-26 河北工业大学 Welding defect detection method based on shape perception
CN118279567A (en) * 2024-05-30 2024-07-02 合肥工业大学 Construction method of target detection model and road crack detection method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109285139A (en) * 2018-07-23 2019-01-29 同济大学 A kind of x-ray imaging weld inspection method based on deep learning
CN110570410A (en) * 2019-09-05 2019-12-13 河北工业大学 Detection method for automatically identifying and detecting weld defects
AU2020101011A4 (en) * 2019-06-26 2020-07-23 Zhejiang University Method for identifying concrete cracks based on yolov3 deep learning model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109285139A (en) * 2018-07-23 2019-01-29 同济大学 A kind of x-ray imaging weld inspection method based on deep learning
AU2020101011A4 (en) * 2019-06-26 2020-07-23 Zhejiang University Method for identifying concrete cracks based on yolov3 deep learning model
CN110570410A (en) * 2019-09-05 2019-12-13 河北工业大学 Detection method for automatically identifying and detecting weld defects

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
SANG-JIN OH et al.: "Automatic Detection of Welding Defects Using Faster R-CNN", Applied Sciences *
WU XIAOYUAN: "Detection and Segmentation of Defects in Industrial CT Images with Mask R-CNN", China Masters' Theses Full-text Database, Information Science and Technology *
LI CHUJUN: "Research on Damage Feature Recognition in Concrete CT Images Based on Deep Convolutional Neural Networks", China Masters' Theses Full-text Database, Engineering Science and Technology II *
DONG HONGYI: "Deep Learning: Object Detection in Practice with PyTorch", China Machine Press, 31 January 2020 *
FUJITA KAZUYA et al.: "Practical Deep Learning", China Machine Press, 31 October 2020 *
GUO PEILIN: "Research on Defect Detection Technology for Tire X-ray Images Based on Deep Learning", China Masters' Theses Full-text Database, Engineering Science and Technology I *
HUANG XUFENG: "Research on Online Monitoring Methods for Welding Quality Based on Deep Transfer Learning", China Masters' Theses Full-text Database, Engineering Science and Technology I *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113888542A (en) * 2021-12-08 2022-01-04 常州微亿智造科技有限公司 Product defect detection method and device
CN114565802A (en) * 2021-12-15 2022-05-31 北京信息科技大学 Wind driven generator extraction method
CN114943843A (en) * 2022-06-14 2022-08-26 河北工业大学 Welding defect detection method based on shape perception
CN114943843B (en) * 2022-06-14 2024-06-25 河北工业大学 Welding defect detection method based on shape sensing
CN118279567A (en) * 2024-05-30 2024-07-02 合肥工业大学 Construction method of target detection model and road crack detection method

Similar Documents

Publication Publication Date Title
Yu et al. A real-time detection approach for bridge cracks based on YOLOv4-FPM
CN109859190B (en) Target area detection method based on deep learning
CN109886066B (en) Rapid target detection method based on multi-scale and multi-layer feature fusion
CN108830285B (en) Target detection method for reinforcement learning based on fast-RCNN
CN111145174B (en) 3D target detection method for point cloud screening based on image semantic features
CN111461212B (en) Compression method for point cloud target detection model
CN113496480A (en) Method for detecting weld image defects
CN111179217A (en) Attention mechanism-based remote sensing image multi-scale target detection method
CN110287826B (en) Video target detection method based on attention mechanism
CN109583483A (en) A kind of object detection method and system based on convolutional neural networks
CN107103317A (en) Fuzzy license plate image recognition algorithm based on image co-registration and blind deconvolution
CN113609896A (en) Object-level remote sensing change detection method and system based on dual-correlation attention
CN113592911B (en) Apparent enhanced depth target tracking method
CN111768388A (en) Product surface defect detection method and system based on positive sample reference
CN113920468B (en) Multi-branch pedestrian detection method based on cross-scale feature enhancement
CN111738055A (en) Multi-class text detection system and bill form detection method based on same
CN108230330B (en) Method for quickly segmenting highway pavement and positioning camera
CN114299383A (en) Remote sensing image target detection method based on integration of density map and attention mechanism
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN114332921A (en) Pedestrian detection method based on improved clustering algorithm for Faster R-CNN network
CN117372898A (en) Unmanned aerial vehicle aerial image target detection method based on improved yolov8
CN112308040A (en) River sewage outlet detection method and system based on high-definition images
CN115147418A (en) Compression training method and device for defect detection model
CN117853955A (en) Unmanned aerial vehicle small target detection method based on improved YOLOv5
CN113361496B (en) City built-up area statistical method based on U-Net

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211012

RJ01 Rejection of invention patent application after publication