CN116258175A

CN116258175A - Weld defect intelligent recognition model evolution method based on active learning

Info

Publication number: CN116258175A
Application number: CN202211731003.3A
Authority: CN
Inventors: 张焕群; 赵真; 赵阳; 付明芮
Original assignee: Shenyang Paidelin Technology Co ltd
Current assignee: Shenyang Paidelin Technology Co ltd
Priority date: 2022-12-30
Filing date: 2022-12-30
Publication date: 2023-06-13

Abstract

The invention provides an active-learning-based method for evolving an intelligent weld defect recognition model, and relates to the technical field of artificial intelligence and active learning. First, an initial model is established: digitized weld film images are labeled and fed into a target detection model for iterative training to obtain a primary detector. Second, the detection results are input into a value sample mining module, which actively screens out defect samples that affect the model's detection rate and are prone to false detection. Third, the mined value samples are actively labeled and a value sample annotation database is established. Finally, the model is retrained on all data sets, and its evolution and upgrade are realized by fine-tuning the model parameters. The method addresses the low capability of traditional fixed models to detect unknown defects and improves the intelligent recognition of radiographic images of long-distance pipelines.

Description

Weld defect intelligent recognition model evolution method based on active learning

Technical Field

The invention relates to the technical field of artificial intelligence and active learning, and in particular to an active-learning-based method for evolving an intelligent weld defect recognition model.

Background

The rapid expansion of oil and gas pipeline construction poses challenges for quality control of long-distance pipeline joints. Radiographic testing can directly show the shape, size and distribution of internal defects in a material structure, has unique advantages for qualitative defect detection, and is a nondestructive testing technology widely used in industry. Pipeline digitization covers the informatization, networking, intelligent processing and visualization of pipeline data. As artificial intelligence is increasingly applied to industrial images, machine learning can make defect identification for long-distance pipelines more intelligent and automated, which accelerates the construction of digitized pipelines and improves the safety and reliability of energy transportation projects.

At present, many intelligent defect identification methods and inventions have been proposed for weld quality evaluation. These technologies have advanced intelligent evaluation and improved defect identification accuracy to a certain extent. However, weld defects are highly varied and welding processes are complex, while current methods are mainly based on fixed deep learning models. A fixed defect recognition model cannot evolve or improve its performance, so its ability to detect unknown defects is limited.

To improve the efficiency and accuracy of weld quality assessment, and in view of the above analysis and the limitations of current technology, an evolvable weld defect detection model is proposed, so as to accelerate the digitization and intelligent processing of X-ray images.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides an intelligent weld defect recognition model evolution method based on active learning.

In order to solve the technical problems, the invention adopts the following technical scheme:

Step 1: Weld radiographic images scanned from digitized film, together with their defect labels, are fed into a detection network for training and optimization to obtain an initial detection model;

Step 1.1: A weld defect training set is annotated on the digitized images, giving the pixel coordinates of each defect as a rectangular box, to form the labeled training set, where n is the number of training-set samples. For any training-set image, the defects in the image form a defect instance set, and the instance labels of the image cover its k defects, where k is the number of defects in the image and the superscript L denotes labeled data;
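The formulas of step 1.1 are not legible in this text. One notation consistent with the surrounding definitions (n images, k defects per image, superscript L for labeled data, rectangular boxes in pixel coordinates) might look as follows; the symbols D, Y, c and b and the corner-coordinate box format are assumptions made for illustration, not the patent's own notation.

```latex
D^{L} = \{(x_i^{L},\, Y_i^{L})\}_{i=1}^{n}, \qquad
Y_i^{L} = \{(c_j,\, b_j)\}_{j=1}^{k}, \qquad
b_j = (u_j^{\min},\, v_j^{\min},\, u_j^{\max},\, v_j^{\max})
```

Here x_i^L is the i-th labeled image, c_j is the defect class of the j-th instance and b_j is its rectangular box.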

Step 1.2: Data enhancement and augmentation are applied to the training set.

Expanding the training samples improves the generalization capability of the model, simulates the diversity and complexity of samples in unknown scenes, and improves the detection precision of the initial model. The image enhancement module applies Gaussian noise, adaptive brightness adjustment, image translation, image flipping, image scaling and multi-image stitching, and applies the same operations to the instance labels, thereby expanding the data set, where N is the number of defects in the weld images after the data set is expanded;
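The key point of step 1.2 is that every geometric operation applied to an image must also be applied to its box labels. A minimal sketch of three of the listed operations follows, assuming grayscale images stored as lists of pixel rows and boxes given as (x1, y1, x2, y2) with exclusive right and bottom edges; the function names and the box format are hypothetical and not taken from the patent.

```python
import random

def hflip(image, boxes):
    """Mirror the image left-right and mirror the boxes the same way."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x2, y1, width - x1, y2) for (x1, y1, x2, y2) in boxes]
    return flipped, new_boxes

def translate(image, boxes, dx, fill=0):
    """Shift the image right by dx >= 0 pixels, filling the gap with `fill`.

    Boxes are shifted by the same amount, clipped to the image, and dropped
    if nothing of them remains inside the image.
    """
    width = len(image[0])
    shifted = [[fill] * dx + row[:width - dx] for row in image]
    new_boxes = []
    for x1, y1, x2, y2 in boxes:
        nx1, nx2 = min(x1 + dx, width), min(x2 + dx, width)
        if nx2 > nx1:
            new_boxes.append((nx1, y1, nx2, y2))
    return shifted, new_boxes

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise per pixel; boxes are unaffected."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]

image = [[1, 2, 3, 4], [5, 6, 7, 8]]
boxes = [(0, 0, 2, 1)]
print(hflip(image, boxes))
print(translate(image, boxes, dx=1))
print(add_gaussian_noise(image, sigma=5.0, rng=random.Random(0)))
```

Photometric operations such as noise or brightness adjustment leave the labels unchanged, whereas flipping, translation, scaling and stitching all require the boxes to be recomputed.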

Step 1.3: The weld images are input in batches into a fully convolutional detection network, which extracts feature representations and outputs the predicted class probability and the predicted target position;

Step 1.4: The loss L_det is minimized with an SGD optimizer, and the initial detection model is retained once the loss fluctuation has stabilized. L_det consists of two parts, a classification loss and a regression loss, as shown in formula (1), where BCE(·) is the binary cross-entropy classification loss, DIoU(·) is the bounding-box regression loss, and the two label terms are the class probability and the bounding box of the label of x_i;
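To make the two parts of L_det concrete, the sketch below computes a binary cross-entropy term and a DIoU regression term for one predicted box. It assumes the usual DIoU loss definition (1 - IoU plus the squared centre distance divided by the squared diagonal of the smallest enclosing box) and an unweighted sum of the two terms; the patent's exact weighting is not legible in this text, so `detection_loss` is a hypothetical helper rather than the patented formula.

```python
import math

def bce(p_pred, p_true, eps=1e-7):
    """Binary cross-entropy for one predicted probability."""
    p = min(max(p_pred, eps), 1.0 - eps)  # avoid log(0)
    return -(p_true * math.log(p) + (1.0 - p_true) * math.log(1.0 - p))

def diou_loss(box_a, box_b):
    """DIoU loss for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centres.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2
    penalty = d2 / c2 if c2 > 0 else 0.0
    return 1.0 - iou + penalty

def detection_loss(p_pred, p_true, box_pred, box_true):
    """Assumed form: classification term plus regression term, unweighted."""
    return bce(p_pred, p_true) + diou_loss(box_pred, box_true)

print(detection_loss(0.8, 1.0, (10, 10, 50, 40), (12, 8, 52, 42)))
```

Unlike plain IoU, the centre-distance penalty still gives a useful gradient when the predicted and labeled boxes do not overlap, which is one common reason for choosing DIoU for box regression.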

According to the loss value calculated by formula (1), the optimizer SGD(·) iteratively optimizes the global network parameters Θ until the loss function is minimized, as shown in formula (2).

An upper limit e on the number of training epochs is set, and training stops when the number of epochs reaches e, yielding a primary defect detection model that can infer defect classification and localization information for unknown images.
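Formulas (1) and (2) are not legible in this text. A reconstruction consistent with the definitions given for them could read as below; the hatted symbols for predictions, the sum over samples and the absence of weighting coefficients are assumptions.

```latex
L_{det} = \sum_{i} \Big[ \mathrm{BCE}\big(\hat{p}_i,\, p_i\big) + \mathrm{DIoU}\big(\hat{b}_i,\, b_i\big) \Big] \tag{1}
```

```latex
\Theta \leftarrow \mathrm{SGD}\big(L_{det},\, \Theta\big) \tag{2}
```

Here p_i and b_i are the class probability and the bounding box of the label of x_i, and the hatted terms are the corresponding network predictions.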

Step 2: Based on the detection results of model inference, i.e. the predicted class probabilities and target positions, a multi-value sample mining strategy is designed. An information-entropy-based uncertainty rule is used to mine high-confidence definite defect samples and low-confidence ambiguous samples; considering the high similarity (small inter-class difference) among weld defects, a boundary-uncertainty rule is used to query samples whose classification is ambiguous; finally, a Bayesian strategy with updated posterior statistics takes the frequency of defect occurrence into account when mining samples;

Step 2.1: Definite samples with high confidence are mined. For the more harmful defect types in weld defect identification (cracks, lack of fusion and strip defects), high-confidence sampling uses the formula:

conf(cls_i) > δ_h (3)

where cls_i denotes a crack, strip or lack-of-fusion defect bounding box, conf(·) is the class confidence, and δ_h is a set threshold; boxes whose confidence exceeds δ_h are selected as value samples. Mining a large number of definite samples automates the collection of high-confidence harmful-defect samples and thus raises the automation level of labeling;
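Formula (3) amounts to a filter over detections. The sketch below assumes each detection is a dictionary with a class name and a confidence; the class-name strings and the dictionary layout are hypothetical, and the default threshold of 0.9 is the value used in the embodiment.

```python
# Defect classes treated as harmful in step 2.1 (names are illustrative).
HARMFUL_CLASSES = {"crack", "lack_of_fusion", "strip"}

def high_confidence_samples(detections, delta_h=0.9):
    """Keep harmful-class detections whose confidence exceeds delta_h."""
    return [d for d in detections
            if d["cls"] in HARMFUL_CLASSES and d["conf"] > delta_h]

detections = [
    {"cls": "crack", "conf": 0.95},
    {"cls": "crack", "conf": 0.60},
    {"cls": "porosity", "conf": 0.99},
]
print(high_confidence_samples(detections))
```

Only the first detection passes: the second is below the threshold and the third is not one of the harmful classes.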

Step 2.2: Information entropy samples, on which the model's feature judgment diverges strongly, are mined. The formula for information entropy sampling (Entropy Samples) is:

$$x_e = -\sum_{i=1}^{n} p_i \log p_i \qquad (4)$$

where n is the number of defect classes and p_i is the probability of the i-th class; summing the products of each probability and its logarithm gives the expected class uncertainty x_e. Information entropy takes the probabilities of all classes into account and handles cases where no class confidence clearly dominates;
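Entropy sampling can be sketched as follows, assuming the standard Shannon entropy x_e = -Σ p_i log p_i over the class probabilities of one box. The selection threshold and the dictionary layout are assumptions; the text does not state how x_e is turned into a selection decision.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (natural logarithm)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_samples(detections, threshold):
    """Keep detections whose class distribution is uncertain enough."""
    return [d for d in detections if entropy(d["probs"]) > threshold]

detections = [
    {"id": "a", "probs": [0.34, 0.33, 0.33]},  # model is undecided
    {"id": "b", "probs": [0.98, 0.01, 0.01]},  # model is confident
]
print([d["id"] for d in entropy_samples(detections, threshold=0.5)])
```

A near-uniform distribution gives an entropy close to log n, the maximum, while a confident prediction gives a value close to zero.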

Step 2.3: To address the high inter-class similarity of long-distance pipeline weld defects, boundary sampling looks at the two classes with similar prediction probabilities within a bounding box and mines such samples as follows:

conf(cls_i) - conf(cls_j) < δ_l (5)

where cls_i and cls_j denote defects of the i-th and j-th classes and δ_l is the difference threshold, used to screen out defect types with high class similarity.
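Formula (5) can be read as margin sampling. The sketch below assumes that cls_i and cls_j are the two most probable classes of a box, which the text suggests but does not state outright; the default δ_l of 0.1 is the value used in the embodiment, and the data layout is hypothetical.

```python
def margin(probs):
    """Difference between the two largest class probabilities."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]

def boundary_samples(detections, delta_l=0.1):
    """Keep detections whose top two classes are nearly tied."""
    return [d for d in detections if margin(d["probs"]) < delta_l]

detections = [
    {"id": "a", "probs": [0.48, 0.45, 0.07]},  # two classes nearly tied
    {"id": "b", "probs": [0.90, 0.05, 0.05]},  # clear winner
]
print([d["id"] for d in boundary_samples(detections)])
```

Compared with entropy sampling, this rule ignores the tail of the distribution and reacts only to confusion between the two leading classes.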

Step 2.4: While value samples are mined in batches, the frequency of each defect class among all samples is counted, and the posterior probability under the class frequency is calculated by taking that frequency as an influence factor on the class confidence, giving the formula for Bayes sampling (Bayes Samples), where conf(·) is the confidence, weighted by the influence factor derived from the class frequency, n is the number of defect classes, and N is the total number of samples. To keep the influence factor from being unstable when mining starts, c is set as a class base count. The more often a class occurs, the smaller its influence factor, so Bayes sampling focuses on minority classes with few occurrences and mitigates sample imbalance.
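The Bayes sampling formula itself is not legible in this text, so the sketch below only illustrates the stated behaviour: a per-class factor that shrinks as the class becomes more frequent and that a base count c keeps stable while few samples have been seen. The specific form 1 - (count + c) / (N + c·n) is an assumption chosen for illustration, not the patent's formula, and all names are hypothetical.

```python
from collections import Counter

def influence_factors(counts, classes, c=1.0):
    """Assumed factor: one minus the smoothed class frequency."""
    total = sum(counts.get(k, 0) for k in classes)
    n_classes = len(classes)
    return {k: 1.0 - (counts.get(k, 0) + c) / (total + c * n_classes)
            for k in classes}

def bayes_scores(detections, classes, c=1.0):
    """Weight each detection's confidence by its class influence factor."""
    counts = Counter(d["cls"] for d in detections)
    factors = influence_factors(counts, classes, c)
    return [(d, d["conf"] * factors[d["cls"]]) for d in detections]

classes = ["porosity", "crack"]
print(influence_factors({"porosity": 8, "crack": 2}, classes))
print(influence_factors({}, classes))  # before any sample has been mined
```

With these counts the rare class receives the larger factor, and with no samples at all every class starts from the same factor instead of dividing by zero.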

Step 3: The unlabeled weld images are fed into the initial defect detection model, which classifies the pipeline weld defects and localizes their bounding boxes to obtain unsupervised detection results. The detection results are input into the value sample mining strategy module of step 2, which actively screens out defect samples that affect the model's detection rate and are prone to false detection;

Step 3.1: The height of the digitized film is h. It is judged whether the film, with its one or more defect boxes, can be cut into h × h sub-images, and the film is cut into several sub-images of h × h resolution, with the tail padded with zeros;
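The cutting in step 3.1 can be sketched as below, assuming the film is a single-channel image stored as h rows of pixel values and is cut left to right into square tiles; the function name is hypothetical, and the handling of defect boxes that straddle a cut is not covered.

```python
def tile_film(image):
    """Cut an h x w film into h x h tiles, zero-padding the last tile."""
    h = len(image)
    w = len(image[0])
    n_tiles = -(-w // h)  # ceiling division
    tiles = []
    for t in range(n_tiles):
        start = t * h
        tile = []
        for row in image:
            segment = row[start:start + h]
            # Pad the tail tile on the right so every tile is h wide.
            tile.append(segment + [0] * (h - len(segment)))
        tiles.append(tile)
    return tiles

film = [[1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10]]
for tile in tile_film(film):
    print(tile)
```

A 2 × 5 film yields three 2 × 2 tiles, the last of which holds one real column and one column of zeros.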

Step 3.2: The sub-images are input into the defect detection model, which infers the class and position of each defect. These results are passed to the four sampling strategies of step 2 and aggregated to obtain the valuable weld films, realizing active mining of samples.

Step 4: In the mined value samples, the defects on the digitized long-distance pipeline film are cut out in segments so that the target detection model can learn the defect features in a targeted way. Experts manually annotate the cut film segments, correcting false detections and labeling missed defects, to form value samples with ground-truth annotations. The defect detection model is then fine-tuned and retrained on these samples to obtain the evolved model, completing one evolution of the model;

Step 4.1: The valuable weld films obtained in step 3.2 are actively labeled by hand by a nondestructive testing expert, yielding the value sample annotation data set;

Step 4.2: The detection model is updated and fine-tuned. Part of the model's weights are frozen to accelerate convergence, so that the parameters Θ are divided into the weights to be updated and the frozen weights Θ_r;

Step 4.3: The annotated initial data set and the mined value sample data set are combined. The loss L_det is then minimized with an SGD optimizer, and a stable model is retained once the loss fluctuation has stabilized. The loss consists of the detection loss on the initial data set and the detection loss on the mined sample data set, as shown in formula (9), in which the model evolution loss combines the loss terms for the initial data set and for the value samples with weights λ_l and λ_s respectively. Because the recognition model was trained on the initial data set, it already holds a great deal of knowledge about that data set but lacks knowledge about the mined sample data set. To strengthen learning from the mined samples, their share of the loss is increased, i.e. λ_s is assigned a larger weight than λ_l, so that the network learns more from the mined samples and the model evolves. The optimizer SGD(·) iteratively optimizes the global network parameters Θ until the loss function is minimized, and training stops when the set number of epochs e is reached, yielding an evolved model that can infer defect classification and localization information for unknown images.
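Formula (9) is not legible in this text. A form consistent with the description, in which the evolution loss weights the detection loss on the initial data set and on the mined value samples, might be written as below; the symbol names L_evo, L_det^l and L_det^s are assumptions.

```latex
L_{evo} = \lambda_l \, L_{det}^{\,l} + \lambda_s \, L_{det}^{\,s}, \qquad \lambda_s > \lambda_l \tag{9}
```

With the embodiment's values λ_l = 0.5 and λ_s = 1, a unit of detection loss on a mined sample counts twice as much as the same loss on a sample from the initial data set.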

The beneficial effects of the above technical scheme are as follows: the invention provides an active-learning-based method for evolving an intelligent weld defect recognition model that can effectively mine potential high-value samples and use them to fine-tune the model parameters, realizing model evolution and giving the intelligent defect recognition model higher precision.

Drawings

FIG. 1 is a diagram of a general architecture of a method according to an embodiment of the present invention;

FIG. 2 is a diagram of a network architecture provided by an embodiment of the present invention;

FIG. 3 is a value sample mining structure diagram provided by an embodiment of the present invention.

Detailed Description

The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.

The overall method of this embodiment is shown in FIG. 1. A YOLOv5 network is used as the baseline model, and the initial training set and pre-trained weights are input into the model to obtain a prediction set. The predictions are collected and filtered by the sampling strategies to screen out high-value samples, which are then queried and labeled by domain experts to obtain an updated training set; the unlabeled sample pool and the pre-trained weights are updated, and the model is retrained and fine-tuned to obtain the evolved model. Part of the YOLOv5 network structure is shown in FIG. 2. The specific implementation is as follows:

Step 1: Weld radiographic images scanned from digitized film, together with their defect labels, are fed into a detection network for training and optimization to obtain an initial detection model; all model operations are performed in the PC environment shown in the table below;

Step 1.1: A weld defect training set is annotated on the digitized images, giving the pixel coordinates of each defect as a rectangular box. For any training-set image, the defects in the image form a defect instance set, and the instance labels of the image cover its k defects, where k is the number of defects in the image and the superscript L denotes labeled data (Label);

Step 1.2: Data enhancement and augmentation are applied to the training set.

Expanding the training samples improves the generalization capability of the model, simulates the diversity and complexity of samples in unknown scenes, and improves the detection precision of the initial model. The image enhancement module applies Gaussian noise, adaptive brightness adjustment, image translation, image flipping, image scaling, multi-image stitching and similar operations, and applies the same operations to the instance labels, thereby expanding the data set;

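Step 1.3: The weld images are input in batches into the detection network, which outputs the predicted class probability and the predicted target position;

Step 1.4: The loss L_det, consisting of a classification loss and a regression loss, is minimized with an SGD optimizer, where the label terms are the class probability and the bounding box of the label of x_i;

An upper limit e on the number of training epochs is set; in this embodiment e = 150. Training stops when the number of epochs reaches e, yielding a primary defect detection model that can infer defect classification and localization information for unknown images.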

Step 2: Based on the detection results of model inference, i.e. the predicted class probabilities and target positions, a multi-value sample mining strategy is designed, as shown in FIG. 3. An information-entropy-based uncertainty rule is used to mine high-confidence definite defect samples and low-confidence ambiguous samples; considering the high similarity (small inter-class difference) among weld defects, a boundary-uncertainty rule is used to query samples whose classification is ambiguous; finally, a Bayesian strategy with updated posterior statistics takes the frequency of defect occurrence into account when mining samples;

Step 2.1: Definite samples with high confidence are mined using the high-confidence sampling formula:

conf(cls_i) > δ_h (3)

where cls_i denotes a crack, strip or lack-of-fusion defect bounding box, conf(·) is the class confidence, and δ_h is a set threshold; this embodiment takes δ_h = 0.9. Boxes whose confidence exceeds δ_h are selected as value samples. Mining a large number of definite samples automates the collection of high-confidence harmful-defect samples and thus raises the automation level of labeling;

Step 2.2: Information entropy samples are mined using formula (4), where n is the number of defect classes and p_i is the probability of the i-th class; summing the products of each probability and its logarithm gives the expected class uncertainty x_e. Information entropy takes the probabilities of all classes into account and handles cases where no class confidence clearly dominates;

Step 2.3: Samples whose two leading classes have similar prediction probabilities are mined by boundary sampling:

conf(cls_i) - conf(cls_j) < δ_l (5)

where cls_i and cls_j denote defects of the i-th and j-th classes and δ_l is the difference threshold, used to screen out defect types with high class similarity; this embodiment takes δ_l = 0.1.

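Step 3: The unlabeled weld images are fed into the initial defect detection model to obtain detection results, which are input into the value sample mining strategy module of step 2;

Step 3.1: The digitized film of height h is cut into several sub-images of h × h resolution, with the tail padded with zeros;

Step 3.2: The sub-images are input into the defect detection model, and the inferred class and position information is passed to the sampling strategies of step 2 to obtain the valuable weld films;

Step 4: The mined value samples are annotated by experts, and the defect detection model is fine-tuned and retrained on them to obtain the evolved model, completing one evolution of the model;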

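Step 4.1: The valuable weld films obtained in step 3.2 are actively labeled by hand by a nondestructive testing expert, yielding the value sample annotation data set;

Step 4.2: The detection model is updated and fine-tuned. Part of the model's weights are frozen to accelerate convergence, so that the parameters Θ are divided into the weights to be updated and the frozen weights Θ_r;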

Step 4.3: The annotated initial data set and the mined value sample data set are combined, and the model evolution loss is formed from the loss terms for the initial data set and for the value samples, weighted by λ_l and λ_s, which take the values 0.5 and 1 respectively. Because the defect recognition model was trained on the initial data set, it already holds a large amount of knowledge about that data set but lacks knowledge about the mined sample data set. To strengthen learning from the mined samples, their share of the loss is increased, i.e. λ_s is assigned a larger weight than λ_l, so that the network learns more from the mined samples and the model evolves. The optimizer SGD(·) iteratively optimizes the global network parameters Θ until the loss function is minimized, and training stops when the set number of epochs, 150, is reached, yielding an evolved model that can infer defect classification and localization information for unknown images.

The model obtained by the invention is compared with DefectNet and TAS-Net on three metrics: precision, recall and average precision. Precision is the ratio of correct predictions to all predictions, recall is the ratio of correct predictions to all labels, and average precision combines precision and recall. The results for these metrics are shown in the following table:

Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in those embodiments may be modified, or some or all of their technical features replaced by equivalents, without departing from the scope of the invention as defined by the appended claims.

Claims

1. An intelligent weld defect recognition model evolution method based on active learning is characterized by comprising the following steps:

Step 1: Weld radiographic images scanned from digitized film, together with their defect labels, are fed into a detection network for training and optimization to obtain an initial detection model;

Step 1.1: A weld defect training set is annotated on the digitized images, giving the pixel coordinates of each defect as a rectangular box. For any training-set image, the defects in the image form a defect instance set, and the instance labels of the image cover its defects;

Step 1.2: Data enhancement and augmentation are applied to the training set.

Expanding the training samples improves the generalization capability of the model, simulates the diversity and complexity of samples in unknown scenes, and improves the detection precision of the initial model; the image enhancement module applies Gaussian noise, adaptive brightness adjustment, image translation, image flipping, image scaling and multi-image stitching, and applies the same operations to the instance labels, thereby expanding the data set;

Step 1.3: The weld images are input in batches into the detection network, which outputs the predicted class probability and the predicted target position;

Step 1.4: The loss L_det, consisting of a classification loss and a regression loss, is minimized with an SGD optimizer, where the label terms are the class probability and the bounding box of the label of x_i;

Step 2: Based on the detection results of model inference, i.e. the predicted class probabilities and target positions, a multi-value sample mining strategy is designed. An information-entropy-based uncertainty rule is used to mine high-confidence definite defect samples and low-confidence ambiguous samples; considering the high similarity among weld defects, a boundary-uncertainty rule is used to query samples whose classification is ambiguous; finally, a Bayesian strategy with updated posterior statistics takes the frequency of defect occurrence into account when mining samples;

Step 3: The unlabeled weld images are fed into the initial defect detection model, which classifies the pipeline weld defects and localizes their bounding boxes to obtain unsupervised detection results; the detection results are input into the value sample mining strategy module of step 2, which actively screens out defect samples that affect the model's detection rate and are prone to false detection;

Step 3.1: The digitized film of height h is cut into several sub-images of h × h resolution, with the tail padded with zeros;

Step 3.2: The sub-images are input into the defect detection model, which infers the class and position of each defect; these results are passed to the four sampling strategies of step 2 and aggregated to obtain the valuable weld films, realizing active mining of samples;

Step 4: The mined value samples are annotated by experts, and the defect detection model is fine-tuned and retrained on them to obtain the evolved model, completing one evolution of the model.

2. The method for intelligently identifying and evolving a model of a weld defect based on active learning according to claim 1, wherein the step 2 specifically comprises the following steps:

Step 2.1: To mine the more harmful defect samples, samples are collected by high-confidence sampling;

Step 2.2: To mine defect samples on which model inference diverges strongly, samples are collected by information entropy sampling;

Step 2.3: To mine defect samples with high inter-class similarity, samples are collected by boundary sampling;

Step 2.4: Taking into account the proportion of each defect class among all samples, samples are collected by Bayes sampling.

3. The method for intelligently identifying the weld defects and evolving the model based on the active learning according to claim 2, wherein the step 2.1 specifically comprises the following steps:

Definite samples with high confidence are mined. For the more harmful defect types in weld defect identification (cracks, lack of fusion and strip defects), high-confidence sampling uses the formula:

conf(cls_i) > δ_h (3)

where cls_i denotes a crack, strip or lack-of-fusion defect bounding box, conf(·) is the class confidence, and δ_h is a set threshold; boxes whose confidence exceeds δ_h are selected as value samples. Mining a large number of definite samples automates the collection of high-confidence harmful-defect samples and thus raises the automation level of labeling.

4. The method for intelligently identifying the weld defects and evolving the model based on the active learning according to claim 2, wherein the step 2.2 specifically comprises the following steps:

Information entropy samples, on which the model's feature judgment diverges strongly, are mined. The information entropy sampling formula is:

$$x_e = -\sum_{i=1}^{n} p_i \log p_i \qquad (4)$$

where n is the number of defect classes and p_i is the probability of the i-th class; summing the products of each probability and its logarithm gives the expected class uncertainty x_e. Information entropy takes the probabilities of all classes into account and handles cases where no class confidence clearly dominates.

5. The method for intelligently identifying the weld defects and evolving the model based on the active learning according to claim 2, wherein the step 2.3 specifically comprises the following steps:

To address the high inter-class similarity of long-distance pipeline weld defects, boundary sampling looks at the two classes with similar prediction probabilities within a bounding box and mines such samples as follows:

conf(cls_i) - conf(cls_j) < δ_l (5)

6. The method for intelligently identifying the weld defects and evolving the model based on the active learning according to claim 2, wherein the step 2.4 specifically comprises the following steps:

While value samples are mined in batches, the frequency of each defect class among all samples is counted, and the posterior probability under the class frequency is calculated by taking that frequency as an influence factor on the class confidence, according to the formula for Bayes sampling (Bayes Samples).

7. The method for intelligently identifying and evolving a model of a weld defect based on active learning according to claim 1, wherein the step 4 specifically comprises the following steps:

Step 4.1: The valuable weld films obtained in step 3.2 are actively labeled by hand by a nondestructive testing expert, yielding the value sample annotation data set;

Step 4.2: The detection model is fine-tuned and updated using the acquired value sample data set;

Step 4.3: The annotated initial data set and the mined value sample data set are combined, and the model evolution loss is formed from the loss terms for the initial data set and for the value samples, weighted by λ_l and λ_s respectively. Because the recognition model was trained on the initial data set, it already holds knowledge about that data set but lacks knowledge about the mined sample data set. To strengthen learning from the mined samples, their share of the loss is increased, i.e. λ_s is assigned a larger weight than λ_l, so that the network learns more from the mined samples and the model evolves. The optimizer SGD(·) iteratively optimizes the global network parameters Θ until the loss function is minimized, and training stops when the set number of epochs e is reached, yielding an evolved model that can infer defect classification and localization information for unknown images.

8. The method for intelligently identifying the weld defects and evolving the model based on the active learning according to claim 7, wherein the step 4.2 specifically comprises the following steps:

The detection model is updated and fine-tuned. Part of the model's weights are frozen to accelerate convergence, so that the parameters Θ are divided into the weights to be updated and the frozen weights Θ_r.