CN114708518A - Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy - Google Patents


Info

Publication number
CN114708518A
Authority
CN
China
Prior art keywords
bolt
model
defect detection
training
semi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210378734.8A
Other languages
Chinese (zh)
Inventor
李刚
张运涛
汪文凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University filed Critical North China Electric Power University
Priority to CN202210378734.8A priority Critical patent/CN114708518A/en
Publication of CN114708518A publication Critical patent/CN114708518A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses a bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy, comprising the following steps: acquiring bolt images of different parts and establishing bolt detection data sets of different types of defects; performing feature processing on the bolt detection data set based on the priori knowledge embedding strategy to generate a sample data set with sample correlation features; constructing a variational self-encoder (variational autoencoder) network model consisting of a batch normalization unit, a graph convolution neural network unit and a convolution unit; and inputting the sample data set into the variational self-encoder network model for training in a semi-supervised learning mode to construct a bolt defect detection model, which is used for identifying the bolt defect type of a bolt image to be detected. The method makes full use of the correlation and dependency between samples of different classes to improve the accuracy of bolt defect detection under a long-tail distribution.

Description

Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy
Technical Field
The invention relates to the field of computer vision, in particular to a bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy.
Background
Transmission lines are the lifeline of power transmission, and their stable operation is vital to the safety of the power grid. Traditional manual inspection can no longer meet present-day requirements. How to use computer vision to automatically and accurately locate and detect defects in aerial inspection images taken by unmanned aerial vehicles has therefore become a key technical problem. Classic machine learning algorithms recognize and classify transmission line bolts inefficiently, and machine-learning-based processing requires bolt defect features to be extracted manually, so such methods do not generalize well. The current mainstream target detection algorithms include Faster R-CNN and the YOLO series, and training models with these two families of deep learning algorithms depends on a large amount of labeled data. However, labeling is time-consuming and labor-intensive, so obtaining a large amount of labeled data in actual production is expensive. In addition, class imbalance among the training samples can severely degrade the final performance of a deep learning model.
In the prior art, machine learning models can be divided into three types, namely supervised learning, unsupervised learning and semi-supervised learning, according to whether labeled data are used during training. Whether a target detection algorithm based on supervised learning performs well depends on whether enough labeled sample data are available. However, in many scenarios such as power generation, the total amount of data on critical defects is small and image annotation is costly, so it is difficult to acquire a large annotated data set. Unsupervised learning uses no labeled data at all and discovers relations between samples by mining the inherent characteristics of the data. However, unsupervised learning algorithms lack prior knowledge, so the accuracy of their output cannot be predicted, and they are difficult to apply in actual production. Semi-supervised learning trains on unlabeled data and labeled data together. Unlabeled data can be assigned labels by the semi-supervised learning model and then used as labeled data, which expands the data set and improves the quality of the model. In practice, however, the quality of the unlabeled data is difficult to control, and improper use of unlabeled data can seriously harm performance.
In view of the above problems, target detection methods based on semi-supervised learning are currently in use. For example, the university of science and technology in China discloses a method for automatically labeling images of appearance defects of industrial products based on semi-supervised learning (application number: CN202010804831.X), in which a trained deep convolutional network classification model labels the data with unknown labels and assigns pseudo labels to them. A predetermined amount of data is then extracted from the pseudo-labeled data and combined with the data with known labels to form a new training set for iteratively training the automatic labeling model. Because the labeling result depends largely on the precision of the deep learning model trained on the labeled samples at the early stage, the accuracy of the final semi-supervised model cannot be guaranteed if the pseudo labels carry too much noise. The Shenzhen Limited company also discloses a target detection method, device and storage medium based on semi-supervised learning (application number: 202011288652.1), which improves the detection generalization ability of the model by using data expanded from the labeled data through operations such as horizontal flipping and data enhancement as new training samples. However, this method severely limits the use of unlabeled samples and cannot make full use of the information contained in a large number of unlabeled samples to improve model accuracy.
Secondly, current deep learning models cannot learn the differences between class A samples and class B samples, or the links and dependency information between class C samples and class B samples, and it is difficult for them to mine the latent knowledge behind samples of different classes. In current target detection tasks, the class label of a given sample is class A, class B, class C, and so on. This labeling method carries no semantic information beyond the category itself: each class label is treated as a basis vector on one of the mutually orthogonal axes of a Cartesian coordinate system, which means that the Euclidean distance between any two classes is the same. This differs considerably from what the classes actually mean in reality. That is, the model treats a normal bolt, a bolt missing a pin and a bolt missing a nut as equivalent categories, whereas a normal bolt and a bolt missing a pin have very similar features, while a normal bolt and a bolt missing a nut differ markedly.
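The equal-distance property of one-hot labels can be checked numerically. The following is a minimal sketch in Python; the class names and the two-dimensional embedding values are hypothetical, chosen by hand for illustration, and are not taken from the invention.

```python
import math
from itertools import combinations

# Hypothetical class names used only for this illustration.
CLASSES = ["normal", "missing_pin", "missing_nut"]


def one_hot(index, size):
    """Return a one-hot vector of the given size with a 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(vectors):
    """Map each unordered pair of class names to the distance between their vectors."""
    return {
        (a, b): euclidean(vectors[a], vectors[b])
        for a, b in combinations(vectors, 2)
    }


# One-hot labels: every pair of classes ends up sqrt(2) apart.
one_hot_labels = {name: one_hot(i, len(CLASSES)) for i, name in enumerate(CLASSES)}
one_hot_dist = pairwise_distances(one_hot_labels)

# Hand-picked toy embeddings (an assumption): similar classes sit close together.
toy_embeddings = {
    "normal": [0.0, 0.0],
    "missing_pin": [0.2, 0.1],
    "missing_nut": [1.0, 0.9],
}
embed_dist = pairwise_distances(toy_embeddings)

for pair in one_hot_dist:
    print(pair, round(one_hot_dist[pair], 3), round(embed_dist[pair], 3))
```

With one-hot labels no pair of classes is closer than any other, whereas an embedding can place a normal bolt nearer to a bolt missing a pin than to a bolt missing a nut.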
In addition, the long-tail distribution problem is at present handled by essentially two types of methods: methods that re-weight the samples and methods that balance different subsets of the samples.
On the one hand, methods based on resampling and re-weighting only address the fact that head classes contribute more to the back-propagated gradient than tail classes. In nature, defect samples have a typical long-tail distribution: different defect classes have different numbers of instances and, more importantly, defects of the same class do not all appear in one image, so the problem of learning tail features from few samples remains unsolved. On the other hand, in nature the label distributions of different classes are usually correlated with one another rather than independent and identically distributed, and these methods cannot overcome the difficulty of recognizing features when tail sample data are scarce. A technical scheme is therefore urgently needed that fundamentally solves the unstable detection capability caused by unbalanced samples by strengthening the extraction of tail features.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a bolt defect detection method that combines semi-supervised learning with a priori knowledge embedding strategy. Multiple related tasks are trained together cooperatively so that parameters and data can be shared among the tasks, which increases the generalization capability of the model.
In order to achieve the technical purpose, the invention provides a bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy, which comprises the following steps:
acquiring bolt images of different parts, and establishing bolt detection data sets of different types of defects;
based on a priori knowledge embedding strategy, performing feature processing on the bolt detection data set to generate a sample data set with sample correlation features, wherein the sample correlation features are used for representing that the features between the samples have correlation;
constructing a variational self-encoder network model, wherein the variational self-encoder network model consists of a batch normalization unit, a graph convolution neural network unit and a convolution unit;
based on the variational self-encoder network model, the sample data set is input into the variational self-encoder network model for training in a semi-supervised learning mode, and a bolt defect detection model is constructed and used for identifying the bolt defect type of the bolt image to be detected.
Preferably, in the process of acquiring bolt images of different parts, the bolt images include a normal bolt image, a shim-missing bolt image, a pin-missing bolt image and a nut-missing bolt image.
Preferably, in the process of constructing the bolt defect detection model, based on the variational self-encoder network model, the characteristics of the sample data set are extracted through a CNN backbone module to obtain image characteristic information;
converting the image characteristic information into a characteristic sequence through an encoder module, and integrating position information coding to obtain a hidden space characteristic vector;
under the guidance of prior knowledge, converting the hidden space feature vector into a plurality of intermediate features, and decomposing the intermediate features into target coordinates and classification labels through a feedforward neural network FNN;
labeling the sample according to the target coordinates and the classification labels to generate labeled data;
and training the variational self-encoder network model according to the marked data and the unmarked data to construct a bolt defect detection model.
Preferably, in the process of constructing the bolt defect detection model, constructing the bolt defect detection model further comprises a first decoder module and a second decoder module;
the first decoder module is used for integrating to obtain sample correlation characteristics according to the position information codes and the intermediate characteristics, wherein the sample correlation characteristics comprise position characteristics and category specific characteristic information;
the second decoder module is used for removing the position characteristics through deconvolution operation and obtaining category specific characteristic information, and the category specific characteristic information is used for obtaining the characteristic matching degree of the intermediate characteristics and the corresponding category specific characteristic information;
preferably, in the process of training the variational self-encoder network model, a training set is generated according to marked data and unmarked data;
inputting the training set into a variational self-encoder network model, carrying out forward propagation once, and then training the model through a back propagation algorithm to obtain a prediction category, a boundary prediction value, a real category label and a boundary.
Preferably, in the process of training the model through a back propagation algorithm, model training is carried out through labeled data to obtain classification loss and bounding box regression loss;
the equation for the classification loss is expressed as:
[Equation: Figure BDA0003591362520000061]
the equation for the bounding box regression loss is expressed as:
[Equation: Figure BDA0003591362520000062]
where $p_i$ is the probability that the i-th anchor is predicted as the true label; $p_i^*$ is 1 for a positive sample and 0 for a negative sample; $t_i$ denotes the predicted bounding box regression parameters of the i-th anchor; $t_i^*$ denotes the real frame regression parameters corresponding to the i-th anchor; and R is the $LOSS_{CIOU}$ loss function.
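The two equation images are not available in this text. The symbols defined above coincide with those of the multi-task loss commonly used by anchor-based detectors such as Faster R-CNN, so one form consistent with the definitions is sketched below. This is an assumption, not a transcription of the original equations: the per-anchor log loss and the use of $N_{cls}$ (number of samples in a batch) and $N_{reg}$ (number of anchors) as normalizers are inferred from the surrounding definitions.

```latex
L_{cls} = \frac{1}{N_{cls}} \sum_{i} -\left[ p_i^{*} \log p_i + \left(1 - p_i^{*}\right) \log\left(1 - p_i\right) \right]

L_{reg} = \frac{1}{N_{reg}} \sum_{i} p_i^{*} \, R\left(t_i, t_i^{*}\right), \qquad R = LOSS_{CIOU}
```

In this form the factor $p_i^{*}$ switches the regression term off for negative anchors, so only positive samples contribute to the bounding box loss.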
Preferably, in the process of training the model through a back propagation algorithm, the equation for the $LOSS_{CIOU}$ loss function is expressed as:
$$LOSS_{CIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
$$\alpha = \frac{v}{(1 - IOU) + v}$$
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$
where IOU is the ratio of the intersection to the union of the predicted frame and the real frame; $b$ and $b^{gt}$ are the center points of the predicted frame and the real frame, respectively; $\rho^2(b, b^{gt})$ is the squared Euclidean distance between the two center points; $c$ is the diagonal length of the smallest enclosing region that can contain both the predicted frame and the real frame; $\alpha$ is a parameter for balancing proportion; and $v$ measures the consistency of the aspect ratios, where $w$, $h$ and $w^{gt}$, $h^{gt}$ are the widths and heights of the predicted frame and the real frame.
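For illustration, the CIoU loss in its commonly published form (one minus IoU, plus the normalized squared center distance, plus an aspect-ratio penalty) can be computed as follows. This is a minimal sketch assuming axis-aligned boxes given as (x1, y1, x2, y2); the function name `ciou_loss` and the small `eps` stabilizer are illustrative choices, not part of the invention.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss between two boxes given as (x1, y1, x2, y2) corner coordinates."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared Euclidean distance between the two box centers (rho^2).
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 + (
        (py1 + py2) / 2 - (ty1 + ty2) / 2
    ) ** 2

    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term v and its balancing weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


same = ciou_loss((0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0))
apart = ciou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))
print(same, apart)
```

Unlike plain IoU, the loss keeps growing with the center distance when the boxes do not overlap, which gives a useful training signal for predictions that miss the target entirely.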
Preferably, in the process of training the model by the back propagation algorithm, model training with unlabeled data is performed by limiting the distance between the output features of each convolutional layer, wherein the model training process is constrained by the feature matching loss between the original image and the corresponding synthetic image:
[Equation: Figure BDA0003591362520000071]
wherein $N_{cls}$ represents the number of samples in a batch and $N_{reg}$ represents the number of anchors.
Preferably, in the process of model training using unlabeled data, the feature information Q corresponding to the original image is used as input, and information including the position feature and the category-specific features is obtained through the first decoder module. The second decoder module integrates the two kinds of features from the first decoder module by deconvolution to obtain feature information. By duality, taking the obtained feature information as input, the position feature and the category-specific feature information can in turn be obtained through the first decoder module and the second decoder module. Because the classification and bounding box regression losses cannot be calculated directly for unlabeled data samples, the invention exploits the duality of feature extraction on unlabeled data and designs the following loss:
[Equation: Figure BDA0003591362520000072]
where T represents the number of layers from which features are extracted in the discriminator, $D_i$ represents the extracted feature, and $N_i$ represents the number of features extracted by the i-th layer of the discriminator network.
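The equation image for this loss is not available in the text. A feature matching loss of the kind described, summed over T layers and normalized by the number of features $N_i$ in each layer, could look like the sketch below. The L1 distance, the function name `feature_matching_loss` and the flat-list representation of each layer's features are assumptions made for illustration only.

```python
def feature_matching_loss(original_feats, synthetic_feats):
    """Sum over layers of the mean absolute difference between two feature sets.

    Each argument is a list with one entry per layer (T entries in total);
    each entry is a flat list of that layer's N_i feature values.
    """
    if len(original_feats) != len(synthetic_feats):
        raise ValueError("both inputs must have the same number of layers")

    total = 0.0
    for layer_orig, layer_synth in zip(original_feats, synthetic_feats):
        if len(layer_orig) != len(layer_synth):
            raise ValueError("matching layers must have the same number of features")
        n_i = len(layer_orig)
        # Assumed distance: L1, averaged over the N_i features of this layer.
        total += sum(abs(a - b) for a, b in zip(layer_orig, layer_synth)) / n_i
    return total


feats_original = [[0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
feats_synthetic = [[1.0, 1.0], [1.0, 1.0, 1.0, 3.0]]
print(feature_matching_loss(feats_original, feats_synthetic))
```

Because this quantity needs only the features of the original image and of its synthetic counterpart, it can be evaluated for samples that have no class label or bounding box annotation.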
Preferably, the bolt defect detecting system for implementing the bolt defect detecting method includes:
the data acquisition module is used for acquiring bolt images of different parts and establishing bolt detection data sets of different types of defects;
the data processing module is used for carrying out feature processing on the bolt detection data set based on a priori knowledge embedding strategy to generate a sample data set with sample correlation features, wherein the sample correlation features are used for representing that the features between the samples have correlation;
the defect identification module is used for constructing a variational self-encoder network model, and the variational self-encoder network model consists of a batch normalization unit, a graph convolution neural network unit and a convolution unit; based on the variational self-encoder network model, the sample data set is input into the variational self-encoder network model for training in a semi-supervised learning mode, and a bolt defect detection model is constructed and used for identifying the bolt defect type of the bolt image to be detected.
The invention discloses the following technical effects:
the invention provides a bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy. On one hand, a unique variational self-encoder network is designed, a target detection problem is converted into a group of dual problems of image transformation, dual relations between the dual problems are used as constraints, two learning models are trained at the same time, and the performances of the two models are mutually promoted. The model can finally reach or approach the performance result of the supervised learning model under the condition of low-proportion labeled data, so that the labeling cost of the data is greatly reduced;
on the other hand, the correlation and the dependency of different types of samples are fully utilized to improve the accuracy of bolt defect detection under long tail distribution.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart illustrating a method for detecting a defect of a bolt according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a variational self-encoder according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating an association relationship between samples according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a sample of an embodiment of the invention;
FIG. 5 is a comparison graph of the detection effects of different models according to the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIGS. 1-5, the invention provides a bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy, comprising the following steps:
acquiring bolt images of different parts, and establishing bolt detection data sets of different types of defects;
based on a priori knowledge embedding strategy, carrying out feature processing on a bolt detection data set to generate a sample data set with sample correlation features, wherein the sample correlation features are used for representing the correlation of features between samples;
constructing a variational self-encoder network model, wherein the variational self-encoder network model consists of a batch normalization unit, a graph convolution neural network unit and a convolution unit;
based on the variational self-encoder network model, the sample data set is input into the variational self-encoder network model for training in a semi-supervised learning mode, and a bolt defect detection model is constructed and used for identifying the bolt defect type of the bolt image to be detected.
Further preferably, in the process of acquiring bolt images of different parts, the bolt images include a normal bolt image, a shim-missing bolt image, a pin-missing bolt image and a nut-missing bolt image.
Further preferably, in the process of constructing the bolt defect detection model, based on the variational self-encoder network model, feature extraction is carried out on the sample data set through a CNN backbone module to obtain image feature information;
converting the image characteristic information into a characteristic sequence through an encoder module, and integrating position information coding to obtain a hidden space characteristic vector;
under the guidance of prior knowledge, converting the hidden space feature vector into a plurality of intermediate features, and decomposing the intermediate features into target coordinates and classification labels through a feedforward neural network FNN;
labeling the sample according to the target coordinates and the classification labels to generate labeled data;
and training the variational self-encoder network model according to the marked data and the unmarked data to construct a bolt defect detection model.
Further preferably, in the process of constructing the bolt defect detection model, constructing the bolt defect detection model further includes a first decoder module and a second decoder module;
the first decoder module is used for integrating to obtain sample correlation characteristics according to the position information codes and the intermediate characteristics, wherein the sample correlation characteristics comprise position characteristics and category specific characteristic information;
the second decoder module is used for removing the position characteristics through deconvolution operation and obtaining category specific characteristic information, and the category specific characteristic information is used for obtaining the characteristic matching degree of the intermediate characteristics and the corresponding category specific characteristic information.
The first decoder module is used for integrating to obtain the characteristic information H according to the position information code and the intermediate characteristic Q (the intermediate characteristic Q refers to the characteristic obtained by the original image through the characteristic extraction network). The feature information H includes the position feature L and the category-specific feature information X, Y, Z and W.
The second decoder module integrates the characteristic information H obtained by the first decoder module, and removes position coding information through deconvolution operation to obtain characteristic information R.
The feature information R refers to category feature information of the image (i.e., the feature information H is removed of the position information). The purpose of the feature information R is to calculate the degree of feature matching between the original image intermediate features Q and the corresponding composite image features R.
Further preferably, in the process of training the variational self-encoder network model, a training set is generated according to labeled data and unlabeled data;
inputting the training set into a variational self-encoder network model, carrying out forward propagation once, and then training the model through a back propagation algorithm to obtain a prediction category, a boundary prediction value, a real category label and a boundary.
Further preferably, in the process of training the model through a back propagation algorithm, model training is carried out through labeled data, and classification loss and boundary box regression loss are obtained;
the equation for the classification loss is expressed as:
[Equation: Figure BDA0003591362520000111]
the equation for the bounding box regression loss is expressed as:
[Equation: Figure BDA0003591362520000121]
where $p_i$ is the probability that the i-th anchor is predicted as the true label; $p_i^*$ is 1 for a positive sample and 0 for a negative sample; $t_i$ denotes the predicted bounding box regression parameters of the i-th anchor; $t_i^*$ denotes the real frame regression parameters corresponding to the i-th anchor; and R is the $LOSS_{CIOU}$ loss function.
Further preferably, in the process of training the model through a back propagation algorithm, the equation for the $LOSS_{CIOU}$ loss function is expressed as:
$$LOSS_{CIOU} = 1 - IOU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
$$\alpha = \frac{v}{(1 - IOU) + v}$$
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$
where IOU is the ratio of the intersection to the union of the predicted frame and the real frame; $b$ and $b^{gt}$ are the center points of the predicted frame and the real frame, respectively; $\rho^2(b, b^{gt})$ is the squared Euclidean distance between the two center points; $c$ is the diagonal length of the smallest enclosing region that can contain both the predicted frame and the real frame; $\alpha$ is a parameter for balancing proportion; and $v$ measures the consistency of the aspect ratios, where $w$, $h$ and $w^{gt}$, $h^{gt}$ are the widths and heights of the predicted frame and the real frame.
Further preferably, in the process of training the model by the back propagation algorithm, model training with unlabeled data is performed by limiting the distance between the output features of each convolutional layer, and the model training process is constrained by the feature matching loss between the original image and the corresponding synthetic image:
[Equation: Figure BDA0003591362520000131]
wherein $N_{cls}$ represents the number of samples in a batch and $N_{reg}$ represents the number of anchors.
Further preferably, in the process of model training using unlabeled data, the process of model training is represented as:
[Equation: Figure BDA0003591362520000132]
where T represents the number of layers from which features are extracted in the discriminator, $D_i$ represents the extracted feature, and $N_i$ represents the number of features extracted by the i-th layer of the discriminator network.
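The text states that labeled data drive the classification and bounding box regression losses while unlabeled data are constrained by the feature matching loss, but it does not say how the two are combined within a batch. The sketch below shows one possible combination, assuming a simple weighted sum averaged over the batch; the `Sample` class, the `unlabeled_weight` parameter and the precomputed loss values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """Per-sample losses; names and structure are illustrative assumptions."""

    supervised_loss: Optional[float]  # classification + box loss, None if unlabeled
    matching_loss: float  # feature matching loss, available for every sample


def batch_loss(samples: List[Sample], unlabeled_weight: float = 1.0) -> float:
    """Average loss over a mixed batch of labeled and unlabeled samples.

    Assumption: labeled samples contribute their supervised loss only, and
    unlabeled samples contribute their feature matching loss scaled by
    `unlabeled_weight`.
    """
    if not samples:
        raise ValueError("batch must contain at least one sample")

    total = 0.0
    for sample in samples:
        if sample.supervised_loss is not None:
            total += sample.supervised_loss
        else:
            total += unlabeled_weight * sample.matching_loss
    return total / len(samples)


mixed_batch = [
    Sample(supervised_loss=1.0, matching_loss=0.5),   # labeled sample
    Sample(supervised_loss=None, matching_loss=0.5),  # unlabeled sample
]
print(batch_loss(mixed_batch, unlabeled_weight=2.0))
```

A weight on the unlabeled term is a common way to keep noisy unlabeled data from dominating early training; whether the invention uses such a weight is not stated.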
Further preferably, a bolt defect detecting system for implementing the bolt defect detecting method includes:
the data acquisition module is used for acquiring bolt images of different parts and establishing bolt detection data sets of different types of defects;
the data processing module is used for carrying out feature processing on the bolt detection data set based on a priori knowledge embedding strategy to generate a sample data set with sample correlation features, wherein the sample correlation features are used for representing that the features between the samples have correlation;
the defect identification module is used for constructing a variational self-encoder network model, and the variational self-encoder network model consists of a batch normalization unit, a graph convolution neural network unit and a convolution unit; based on the variational self-encoder network model, a sample data set is input into the variational self-encoder network model for training in a semi-supervised learning mode, and a bolt defect detection model is constructed and used for identifying the bolt defect type of the bolt image to be detected.
The invention provides a bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy. The overall flow of an embodiment of the invention comprises the following steps:
Step 1: Establish bolt detection data sets covering different types of defects. The power transmission line is inspected by an unmanned aerial vehicle, and bolt images on various components are collected. The bolt images are divided into 4 classes: bolt-normal (ls-zc), bolt-missing gasket (ls-qdp), bolt-missing pin (ls-qxd) and bolt-missing nut (ls-qlm), 1970 images in total; schematic diagrams of the various samples are shown in the following figures.
Step 2: The priori knowledge embedding strategy uses the labeled samples to learn, capture and fuse the correlations and dependencies between samples of different classes, and between the classes themselves, in natural scenes. Graph embedding vectors replace the original one-hot codes to represent the different classes, which enriches the semantic information carried by the class labels and improves reasoning across samples of different classes. The advantage of the graph embedding vector is that, when the class of a sample is being judged and the features of that single sample are not decisive on their own, the features of related samples can be aggregated and added to the features of the current sample, which improves the discrimination of the single sample. Graph embedding maps the graph model into a low-dimensional vector space, and the resulting vector representation should preserve the structural information and latent characteristics of the graph as far as possible. The association between samples is shown in FIG. 3, where the nodes IDx represent bolt samples of different classes, an edge between two nodes indicates that the features of the samples are correlated, and correlations between different samples can propagate along the edges of the graph.
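The aggregation idea in Step 2 can be illustrated with a small sketch. It is not the patented implementation: the class graph `EDGES`, the function name `aggregate` and the choice of plain mean aggregation with self-loops are assumptions made for illustration only. Starting from one-hot class codes, one propagation step over the graph yields dense vectors in which related classes share components.

```python
# Illustrative sketch only; graph structure and aggregation rule are assumptions.
CLASSES = ["ls-zc", "ls-qdp", "ls-qxd", "ls-qlm"]
# Assumed edges: every defect class is linked to the "normal" class (index 0).
EDGES = [(0, 1), (0, 2), (0, 3)]


def aggregate(features, edges):
    """One mean-aggregation step: each node averages itself and its neighbours."""
    n = len(features)
    neighbours = [{i} for i in range(n)]  # self-loop keeps the node's own feature
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    dim = len(features[0])
    out = []
    for i in range(n):
        group = sorted(neighbours[i])
        out.append([sum(features[j][d] for j in group) / len(group) for d in range(dim)])
    return out


# One-hot codes as the starting class representation.
one_hot = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
embedded = aggregate(one_hot, EDGES)
```

After this step the vector of a defect class such as `ls-qdp` is no longer orthogonal to the vector of `ls-zc`, which is the kind of inter-class relation a one-hot code cannot express.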
Step 3: Construct a variational self-encoder network model. The structure of the variational self-encoder is shown schematically in FIG. 2, where Sync BN denotes batch normalization, GCN denotes a graph convolutional neural network, and Conv denotes convolution.
Specifically, features are first extracted from the image by a CNN backbone module to obtain the image feature information. Next, the encoder module converts the extracted features into a 1D sequence and fuses the position information encoding to obtain a hidden space feature vector. The first decoder module then converts the hidden space feature vector into N intermediate features under the guidance of prior knowledge. The second decoder module then converts the N intermediate features output by the first decoder module into feature vectors of the corresponding images under the guidance of the position information encoding. Finally, the N intermediate features output by the first decoder module are decoded into target coordinates and classification labels by a feedforward neural network (FNN), and a joint training model is established. This process can be carried out in a semi-supervised manner, which reduces the dependence on labeled data.
Step 4: Perform collaborative training with labeled and unlabeled data. Collaborative training uses a small number of labeled samples and a large number of unlabeled samples and, through a dual optimization strategy, converts an image from one domain into another domain while ensuring that the converted image can be converted back to the original domain.
The bolt data set obtained in step 1 is divided into a training set and a verification set. For the labeled samples in the training set, in each epoch of training, the same group of labeled samples is input into the variational self-encoder network built in step 3 for one forward propagation, and the model is then trained by the back propagation algorithm. From the predicted class, the predicted boundary, the real class label and the real boundary, the classification loss is calculated by formula (2) and the bounding box regression loss by formula (3):
Figure BDA0003591362520000161
Figure BDA0003591362520000162
Figure BDA0003591362520000163
In formula (4), $p_i$ is the probability that the $i$-th anchor is predicted as the true label; $p_i^*$ is 1 for a positive sample and 0 for a negative sample; $t_i$ denotes the predicted bounding box regression parameters of the $i$-th anchor; $t_i^*$ denotes the regression parameters of the "real frame" corresponding to the $i$-th anchor; $N_{cls}$ represents the number of samples in a batch; $N_{reg}$ represents the number of anchor positions (not the number of anchors); and $R$ is the $LOSS_{CIOU}$ loss function. The specific calculation of $LOSS_{CIOU}$ is shown in formula (5).
Figure BDA0003591362520000166
Figure BDA0003591362520000167
Figure BDA0003591362520000168
In formula (5), IOU is the ratio of the intersection to the union of the "predicted frame" and the "real frame"; $b$ and $b^{gt}$ are the center points of the "predicted frame" and the "real frame", respectively; $\rho^2(b, b^{gt})$ is the squared Euclidean distance between these two center points; $c$ is the diagonal length of the smallest enclosing region that contains both the "predicted frame" and the "real frame"; $\alpha$ is a parameter for balancing proportion; and $v$ measures the aspect-ratio consistency between the anchor frame (anchor) and the predicted frame.
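As an illustration of the quantities just defined, the following sketch computes a CIoU-style loss for a single pair of frames. It assumes the commonly used formulation $1 - IOU + \rho^2(b, b^{gt})/c^2 + \alpha v$ with $v = \frac{4}{\pi^2}(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h})^2$ and $\alpha = v / ((1 - IOU) + v)$; the exact expression in formula (5) is given by the original figures and may differ. The function name `ciou_loss` and the corner-coordinate frame format are assumptions of this sketch, and $v$ is computed here between the real frame and the predicted frame.

```python
import math


def ciou_loss(pred, gt):
    """CIoU-style loss for two frames given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # IOU: intersection area over union area.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union

    # rho^2: squared distance between the two center points.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
        + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # c^2: squared diagonal of the smallest region enclosing both frames.
    c2 = (max(px2, gx2) - min(px1, gx1)) ** 2 + (max(py2, gy2) - min(py1, gy1)) ** 2

    # v: aspect-ratio consistency term; alpha: its balancing weight.
    v = (4 / math.pi ** 2) * (
        math.atan((gx2 - gx1) / (gy2 - gy1)) - math.atan((px2 - px1) / (py2 - py1))
    ) ** 2
    denom = (1 - iou) + v
    alpha = v / denom if denom > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v
```

For two identical frames every term vanishes and the loss is zero; for frames that do not overlap at all, the center-distance term still provides a signal, which plain IOU cannot do.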
For the unlabeled samples of the training set, for which no classification labels are available, a synthetic image corresponding to each original image is obtained by passing the image through the variational self-encoder network model defined in step 3. The bolt detection model is then trained by the back propagation algorithm, with the training process constrained by the feature matching loss between the original image and the corresponding synthetic image. Specifically, a multi-layer discriminator extracts features from the original image and the synthetic image, and the L1 distance between the convolution output features of each layer is then calculated:
Figure BDA0003591362520000171
where $T$ represents the number of discriminator layers from which features are extracted, $D_i$ represents the features extracted at the $i$-th layer, and $N_i$ represents the number of features extracted by the $i$-th layer of the discriminator network.
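A minimal sketch of such a feature matching loss is given below. It assumes the usual form in which, for each of the $T$ layers, the L1 distance between the features of the original and synthetic images is divided by $N_i$ and the per-layer values are summed; the exact expression is given by the original figure. The function name and the representation of each layer's features as a flat list of floats are assumptions of this sketch.

```python
def feature_matching_loss(real_feats, fake_feats):
    """Sum over layers of the mean absolute (L1) feature difference.

    real_feats, fake_feats: one flat list of floats per discriminator layer,
    extracted from the original image and the synthetic image respectively.
    """
    if len(real_feats) != len(fake_feats):
        raise ValueError("both inputs must cover the same number of layers")
    total = 0.0
    for real, fake in zip(real_feats, fake_feats):
        if len(real) != len(fake):
            raise ValueError("feature sizes of a layer must match")
        n_i = len(real)  # N_i: number of features in this layer
        total += sum(abs(r - f) for r, f in zip(real, fake)) / n_i
    return total


# Two layers (T = 2) with 2 and 3 features.
loss = feature_matching_loss([[1.0, 2.0], [0.0, 0.0, 3.0]],
                             [[1.0, 0.0], [0.0, 3.0, 0.0]])
```

Because the loss needs only the image and its reconstruction, it can be evaluated on samples that have no class label or frame annotation.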
Finally, the training effect of the bolt defect detection model is verified on the test set to obtain the trained bolt detection model.
Step 5: Acquire a bolt image to be detected and input it into the trained bolt defect detection model to detect the defect condition of the bolt. The defect condition is given by the defect label with the highest confidence in the feature vector output by the model, together with the bounding box information.
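The label selection in Step 5 amounts to taking the class with the highest score. The sketch below assumes the model output for one detected bolt is a list of per-class confidences in the order of `CLASSES` plus a frame; the function name `decode_detection` and the output dictionary are hypothetical.

```python
CLASSES = ["ls-zc", "ls-qdp", "ls-qxd", "ls-qlm"]


def decode_detection(scores, box):
    """Return the highest-confidence defect label together with its frame."""
    best = max(range(len(scores)), key=lambda k: scores[k])
    return {"label": CLASSES[best], "confidence": scores[best], "box": box}


result = decode_detection([0.05, 0.10, 0.80, 0.05], (10, 20, 50, 60))
```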
For both the proposed method and the comparison methods, the images in the bolt defect data set were captured by an unmanned aerial vehicle during actual inspection, and the sample data was labeled with reference to the labeling scheme for power transmission line equipment of the national power grid. Each image in the labeled samples has a corresponding XML label file, which contains the name of the image, the class of each target and the coordinates of its frame. The data set comprises 4 classes of samples, bolt-normal (ls-zc), bolt-missing gasket (ls-qdp), bolt-missing pin (ls-qxd) and bolt-missing nut (ls-qlm), 1970 images in total; schematic diagrams and the quantity distribution of the samples are shown in FIG. 4 and Table 1.
TABLE 1
Figure BDA0003591362520000181
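Reading one of the XML label files described above could look like the following sketch. The patent states only that each file holds the image name, the target class and the frame coordinates; the tag names used here (`filename`, `object`, `name`, `bndbox`, `xmin`, ...) follow the widely used Pascal VOC layout and are an assumption, as are the sample file content and the function name.

```python
import xml.etree.ElementTree as ET

# Hypothetical label file in a Pascal VOC-like layout (assumed, not from the patent).
SAMPLE = """<annotation>
  <filename>example_0001.jpg</filename>
  <object>
    <name>ls-qxd</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>168</xmax><ymax>131</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Return (image name, [(class label, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    image_name = root.findtext("filename")
    targets = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        box = tuple(int(bb.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        targets.append((obj.findtext("name"), box))
    return image_name, targets


image_name, targets = parse_annotation(SAMPLE)
```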
The invention adopts mean Average Precision (mAP) and inference speed (FPS) as the evaluation indexes for the precision and processing speed of target detection. The mAP is the mean of the average precision over all classes and measures the overall precision of the target detection model. FPS is the number of images that can be processed per second and measures the processing speed of the algorithm.
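The two metrics can be sketched as follows. The patent does not state how average precision is interpolated, so this example assumes all-point interpolation of the precision-recall curve and assumes that each detection has already been marked as a true or false positive (for example by an IOU threshold against the real frames). All function names are illustrative.

```python
def average_precision(is_true_positive, num_gt):
    """AP for one class; detections must be sorted by descending confidence."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    tp = fp = 0
    points = []  # (recall, precision) after each detection
    for hit in is_true_positive:
        if hit:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_gt, tp / (tp + fp)))
    ap, prev_recall = 0.0, 0.0
    for k, (recall, _) in enumerate(points):
        # Interpolated precision: best precision at this recall or higher.
        best_precision = max(p for _, p in points[k:])
        ap += (recall - prev_recall) * best_precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """mAP: mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


def frames_per_second(num_images, elapsed_seconds):
    """FPS: images processed per second of inference time."""
    return num_images / elapsed_seconds


ap_example = average_precision([True, False, True], num_gt=2)
```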
The bolt defect detection method based on semi-supervised learning and the priori knowledge embedding strategy provided by the invention is compared experimentally with the Faster R-CNN and YOLOv5 models. Faster R-CNN and YOLOv5 are trained on the fully labeled bolt data set, whereas the proposed method uses 40% of the data as labeled data and 60% as unlabeled data. The number of training epochs is set to 100 and the batch size to 8. During training, the learning rate is gradually increased from $10^{-6}$ to $10^{-3}$ over the first 10 epochs, is $10^{-2}$ for epochs 10-19, and is reduced to $10^{-4}$ for epochs 20-100. To prevent the model from falling into a local optimum, an SGD optimizer is used with the momentum coefficient set to 0.9 and the attenuation coefficient set to 0.005. The detection performance of the different models is compared in Table 2.
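The training schedule can be written down as a small sketch, reading the description as a warm-up from 10^-6 to 10^-3 over the first 10 epochs, 10^-2 for epochs 10-19, and 10^-4 from epoch 20 onward. Several points are assumptions of this example: the warm-up is taken to be linear (the text says only "gradually increased"), epochs are counted from 0, and the "attenuation coefficient" is interpreted as a weight decay term. The update rule shown is one common form of SGD with momentum, not necessarily the one used by the authors.

```python
def learning_rate(epoch):
    """Piecewise learning rate; epochs counted from 0 (assumption)."""
    if epoch < 10:
        # Linear warm-up from 1e-6 to 1e-3 (interpolation shape is assumed).
        return 1e-6 + (1e-3 - 1e-6) * epoch / 9
    if epoch < 20:
        return 1e-2
    return 1e-4


def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.9, weight_decay=0.005):
    """One SGD-with-momentum update for a single scalar parameter."""
    g = grad + weight_decay * weight      # weight decay added to the gradient
    velocity = momentum * velocity + g    # momentum accumulation
    return weight - lr * velocity, velocity


new_weight, new_velocity = sgd_momentum_step(1.0, 0.5, 0.0, lr=0.1)
```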
TABLE 2
Figure BDA0003591362520000191
As shown in Table 2, with an IOU threshold of 0.5, the proposed method improves mAP by 2.8% and FPS by 3.3 compared with the Faster R-CNN model, improving both detection accuracy and detection speed. Compared with the YOLOv5 model, the proposed method improves the detection precision for power transmission line bolt defects, but still has room for improvement in detection speed.
As can be seen by comparing FIG. 5(a) with FIGS. 5(b) and 5(c), the confidence of the Faster R-CNN and YOLOv5 models in detecting small objects is lower than that of the proposed method, and Faster R-CNN also misses some detections. In contrast, the proposed method can effectively detect the bolt targets on the pendant of the power transmission line and avoid false detection of small targets. The experimental results show that, under unbalanced data samples, the proposed method effectively improves the detection performance of the model; at the same time, using only 60% of the data set, its final detection result approaches that of a model that uses the whole data set at initialization. This shows that the proposed method can effectively reduce the number of labeled images and thus, to a certain extent, the labeling cost.
While the invention has been described in detail in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims. As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (10)

1. The bolt defect detection method based on the semi-supervised learning and the prior knowledge embedding strategy is characterized by comprising the following steps of:
acquiring bolt images of different parts, and establishing bolt detection data sets of different types of defects;
based on a priori knowledge embedding strategy, carrying out feature processing on the bolt detection data set to generate a sample data set with sample correlation features, wherein the sample correlation features are used for representing the correlation of features between samples;
constructing a variational self-encoder network model, wherein the variational self-encoder network model consists of a batch normalization unit, a graph convolution neural network unit and a convolution unit;
and inputting the sample data set into the variational self-encoder network model for training in a semi-supervised learning mode based on the variational self-encoder network model, and constructing a bolt defect detection model, wherein the bolt defect detection model is used for identifying the bolt defect type of the bolt image to be detected.
2. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 1, characterized in that:
in the process of acquiring bolt images of different parts, the bolt images comprise normal bolt images, gasket-lacking bolt images, pin-lacking bolt images and nut-lacking bolt images.
3. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 2, characterized in that:
in the process of constructing a bolt defect detection model, based on the variational self-encoder network model, performing feature extraction on the sample data set through a CNN backbone module to obtain image feature information;
converting the image characteristic information into a characteristic sequence through an encoder module, and fusing position information coding to obtain a hidden space characteristic vector;
under the guidance of prior knowledge, converting the hidden space feature vector into a plurality of intermediate features, and decomposing the intermediate features into target coordinates and classification labels through a Feedforward Neural Network (FNN);
labeling the sample according to the target coordinates and the classification labels to generate labeled data;
and training the variational self-encoder network model according to the marked data and the unmarked data to construct the bolt defect detection model.
4. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 3, characterized in that:
in the process of constructing the bolt defect detection model, the constructing of the bolt defect detection model further comprises a first decoder module and a second decoder module;
the first decoder module is used for integrating the sample correlation characteristics according to the position information codes and the intermediate characteristics, wherein the sample correlation characteristics comprise position characteristics and category specific characteristic information;
the second decoder module is used for removing the position characteristics through deconvolution operation to obtain the category specific characteristic information, and the category specific characteristic information is used for obtaining the characteristic matching degree of the intermediate characteristics and the corresponding category specific characteristic information.
5. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 3, characterized in that:
generating a training set according to the marked data and the unmarked data in the process of training the variational self-coder network model;
and inputting the training set into the variational self-encoder network model, carrying out forward propagation once, and then training the model through a back propagation algorithm to obtain a prediction category, a boundary prediction value, a real category label and a boundary.
6. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 5, characterized in that:
in the process of training a model through a back propagation algorithm, performing model training through the labeled data to obtain classification loss and bounding box regression loss;
the equation expression of the classification loss is as follows:
Figure FDA0003591362510000031
the equation expression of the regression loss of the bounding box is as follows:
Figure FDA0003591362510000032
where $p_i$ is the probability that the $i$-th anchor is predicted as the true label; $p_i^*$ is 1 for a positive sample and 0 for a negative sample; $t_i$ denotes the predicted bounding box regression parameters of the $i$-th anchor; $t_i^*$ denotes the real frame regression parameters corresponding to the $i$-th anchor; and $R$ is the $LOSS_{CIOU}$ loss function.
7. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 6, characterized in that:
in the process of training the model through the back propagation algorithm, the $LOSS_{CIOU}$ loss function is expressed as:
Figure FDA0003591362510000035
Figure FDA0003591362510000036
Figure FDA0003591362510000037
where IOU is the ratio of the intersection to the union of the predicted frame and the real frame; $b$ and $b^{gt}$ are the center points of the predicted frame and the real frame, respectively; $\rho^2(b, b^{gt})$ is the squared Euclidean distance between these two center points; $c$ is the diagonal length of the smallest enclosing region that contains both the predicted frame and the real frame; $\alpha$ is a parameter for balancing proportion; and $v$ measures the aspect-ratio consistency between the anchor frame and the predicted frame.
8. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 7, characterized in that:
in the process of training the model through a back propagation algorithm, performing model training by limiting the distance between the output features of each convolutional layer and using the unlabeled data, wherein the model training process is constrained by the feature matching loss between the original image and the corresponding synthetic image:
Figure FDA0003591362510000041
where $N_{cls}$ represents the number of samples in a batch and $N_{reg}$ represents the number of anchors.
9. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 8, characterized in that:
in the process of model training by using the unlabeled data, the process of model training is represented as follows:
Figure FDA0003591362510000042
where $T$ represents the number of discriminator layers from which features are extracted, $D_i$ represents the features extracted at the $i$-th layer, and $N_i$ represents the number of features extracted by the $i$-th layer of the discriminator network.
10. The bolt defect detection method based on semi-supervised learning and a priori knowledge embedding strategy according to claim 9, characterized in that:
the bolt defect detection system for realizing the bolt defect detection method comprises:
the data acquisition module is used for acquiring bolt images of different parts and establishing bolt detection data sets of different types of defects;
the data processing module is used for carrying out feature processing on the bolt detection data set based on a priori knowledge embedding strategy to generate a sample data set with sample correlation features, wherein the sample correlation features are used for representing that the features between the samples have correlation;
the defect identification module is used for constructing a variational self-encoder network model, and the variational self-encoder network model consists of a batch normalization unit, a graph convolution neural network unit and a convolution unit; and inputting the sample data set into the variational self-encoder network model for training in a semi-supervised learning mode based on the variational self-encoder network model, and constructing a bolt defect detection model, wherein the bolt defect detection model is used for identifying the bolt defect type of the bolt image to be detected.
CN202210378734.8A 2022-04-12 2022-04-12 Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy Pending CN114708518A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210378734.8A CN114708518A (en) 2022-04-12 2022-04-12 Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210378734.8A CN114708518A (en) 2022-04-12 2022-04-12 Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy

Publications (1)

Publication Number Publication Date
CN114708518A true CN114708518A (en) 2022-07-05

Family

ID=82175268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210378734.8A Pending CN114708518A (en) 2022-04-12 2022-04-12 Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy

Country Status (1)

Country Link
CN (1) CN114708518A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116824271A (en) * 2023-08-02 2023-09-29 上海互觉科技有限公司 SMT chip defect detection system and method based on tri-modal vector space alignment
CN117420809A (en) * 2023-12-18 2024-01-19 台山市南特金属科技有限公司 Crankshaft machining optimization decision method and system based on artificial intelligence
CN117523565A (en) * 2023-11-13 2024-02-06 拓元(广州)智慧科技有限公司 Tail class sample labeling method, device, electronic equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116824271A (en) * 2023-08-02 2023-09-29 上海互觉科技有限公司 SMT chip defect detection system and method based on tri-modal vector space alignment
CN116824271B (en) * 2023-08-02 2024-02-09 上海互觉科技有限公司 SMT chip defect detection system and method based on tri-modal vector space alignment
CN117523565A (en) * 2023-11-13 2024-02-06 拓元(广州)智慧科技有限公司 Tail class sample labeling method, device, electronic equipment and storage medium
CN117420809A (en) * 2023-12-18 2024-01-19 台山市南特金属科技有限公司 Crankshaft machining optimization decision method and system based on artificial intelligence
CN117420809B (en) * 2023-12-18 2024-03-01 台山市南特金属科技有限公司 Crankshaft machining optimization decision method and system based on artificial intelligence

Similar Documents

Publication Publication Date Title
CN111368690B (en) Deep learning-based video image ship detection method and system under influence of sea waves
Lin et al. RSCM: Region selection and concurrency model for multi-class weather recognition
CN114708518A (en) Bolt defect detection method based on semi-supervised learning and priori knowledge embedding strategy
CN105574550A (en) Vehicle identification method and device
CN111652293B (en) Vehicle weight recognition method for multi-task joint discrimination learning
CN110969166A (en) Small target identification method and system in inspection scene
CN110633632A (en) Weak supervision combined target detection and semantic segmentation method based on loop guidance
CN111862065B (en) Power transmission line diagnosis method and system based on multitask deep convolutional neural network
CN110390308B (en) Video behavior identification method based on space-time confrontation generation network
CN111199238A (en) Behavior identification method and equipment based on double-current convolutional neural network
CN115659966A (en) Rumor detection method and system based on dynamic heteromorphic graph and multi-level attention
CN113343123B (en) Training method and detection method for generating confrontation multiple relation graph network
CN114332288B (en) Method for generating text generation image of confrontation network based on phrase drive and network
Vijayaraju Image retrieval using image captioning
CN115620083A (en) Model training method, face image quality evaluation method, device and medium
CN114510610A (en) Visual concept recognition method for multi-modal knowledge graph construction
Sugang et al. Object detection algorithm based on cosine similarity IoU
Nag et al. CNN based approach for post disaster damage assessment
Knapik et al. Evaluation of deep learning strategies for underwater object search
al Atrash et al. Detecting and Counting People's Faces in Images Using Convolutional Neural Networks
CN113407439B (en) Detection method for software self-recognition type technical liabilities
CN113838130B (en) Weak supervision target positioning method based on feature expansibility learning
CN115272814B (en) Long-distance space self-adaptive multi-scale small target detection method
Zhao et al. Image data analytics to support engineers’ decision-making
Patel et al. A Comprehensive Analysis of Object Detectors in Adverse Weather Conditions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination