CN112837281A - Pin defect identification method, device and equipment based on cascade convolutional neural network - Google Patents
- Publication number
- CN112837281A CN202110109172.2A
- Authority
- CN
- China
- Prior art keywords
- training
- sample
- image
- model
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The application relates to a pin defect identification method, device and equipment based on a cascaded convolutional neural network. The method comprises the following steps: improving the original MTCNN algorithm to construct an improved MTCNN algorithm model; obtaining image samples of power fittings at connection points of a power transmission line, captured by an unmanned aerial vehicle in different environments, preprocessing the image samples, and constructing a sample training set; training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a trained model; and using the trained model to identify pin defects in an image to be identified, that is, inputting a power fitting image acquired during unmanned aerial vehicle inspection into the trained model to obtain a pin state recognition result. Compared with traditional identification methods, the method substantially improves both identification speed and identification accuracy, and the model is better suited to being ported to mobile devices.
Description
Technical Field
The application relates to the technical field of pin defect identification for power transmission lines, and in particular to a pin defect identification method, device and equipment based on a cascaded convolutional neural network.
Background
Power fittings are essential metal components of a power system. In overhead transmission lines they fix, support, protect and connect power components, and they are indispensable to the stable, safe and reliable operation of the power system. However, most power fittings are exposed to harsh outdoor environments and, over long periods, bear both the mechanical tension loads of external power equipment and the loads generated by power transmission within the system. Under these combined internal and external loads, the pins on power fittings are easily lost, which can cause the power system to fail.
In recent years, unmanned aerial vehicle inspection has been widely used in the routine inspection of transmission lines. It reduces the workload and the risk of climbing inspections for grid operation and maintenance personnel, and allows the fault condition of power equipment to be judged efficiently and accurately. As power systems have matured and the amount of transmission equipment has grown, the volume of aerial images has grown explosively, and traditional manual identification is inefficient. Pin defect detection methods therefore face a serious challenge.
Existing target detection methods struggle to detect pin defects in aerial transmission line images, which have complex backgrounds and large scale. There are several technical difficulties: (1) false detections and missed detections caused by blurred images, complex backgrounds, small targets, varied target appearance and partial occlusion of the target; (2) in convolutional neural networks, high-accuracy networks generally have more layers, so their computation and storage costs are extremely high; (3) training a deeper network requires a large amount of labeled data, the training is complex and inefficient, and high detection accuracy is hard to reach when data are scarce; and (4) because target detection algorithms are designed for conventional images, the image size is fixed during training and detection, so they lack generality and scale adaptability when applied to aerial transmission line images.
Disclosure of Invention
In view of the above, the present application aims to overcome the defects of the prior art and provides a pin defect identification method, apparatus and device based on a cascaded convolutional neural network.
The above object of the present application is achieved by the following technical solutions:
In a first aspect, an embodiment of the present application provides a pin defect identification method based on a cascaded convolutional neural network, including:
improving an algorithm model: improving the original MTCNN algorithm to construct an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing multi-scale feature maps, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a three-stage network structure consisting of a candidate network PNet, a refinement network RNet and an output network ONet;
constructing a training data set: obtaining image samples of power fittings at connection points of a power transmission line, captured by an unmanned aerial vehicle in different environments, preprocessing the image samples, and constructing a sample training set;
training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a trained model;
and using the trained model to identify pin defects in an image to be identified: inputting a power fitting image acquired during unmanned aerial vehicle inspection into the trained model to obtain a pin state recognition result.
Optionally, improving the convolutional layer structure includes:
adding a nonlinear multilayer perceptron after some of the convolutional layers, while decomposing the convolution kernels of the original convolutions and removing the fully connected layers; wherein the calculation formula of each feature map in the nonlinear multilayer perceptron is as follows:
where n denotes the n-th layer, the formula includes an offset (bias) term, (i, j) is the position index of an image pixel, x_{i,j} is the input image patch centered at position (i, j), and k is the index of the feature map to be extracted.
Optionally, fusing the multi-scale feature maps includes:
performing three layers of convolution on a detection window of 12 × 12 pixels in the PNet, then performing the loss calculation, and fusing the feature maps of the first and third layers of the PNet;
fusing the feature maps of the second and third layers of the RNet and of the ONet on the same principle.
Optionally, improving the loss function includes:
rewriting the loss, by means of an angle variable, as a function of the angle, and introducing an integer N to enlarge the angular margin; the improved loss function is:
where L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
Optionally, improving the training strategy includes:
dividing the training process into two steps: pre-training and offline difficult sample training;
before training, dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection over union (IOU);
in the pre-training stage, adopting an online difficult sample mining strategy, that is, during training, sorting the losses computed for each batch of data and taking a set proportion of the samples with the largest losses as difficult samples; during back propagation, updating the weights of the neural network model using only the losses of the difficult samples;
in the offline difficult sample training stage, to suit the different input scales of the networks, scaling the obtained positive samples, negative samples and partial positive samples to 12 × 12 pixels and 24 × 24 pixels to train the PNet and the improved RNet respectively; finally, scaling the acquired difficult samples, positive samples and partial positive samples to 48 × 48 pixels to retrain the improved ONet.
Optionally, dividing the image samples into positive samples, negative samples and partial positive samples according to the IOU includes:
selecting training samples by a sliding window method, that is, building an image pyramid from each training image and selecting regions of the image with a sliding window of a set size;
and calculating the IOU between each selected region and the labeling box, recording regions with an IOU greater than 0.7 as positive samples, regions with an IOU less than 0.3 as negative samples, and regions with an IOU between 0.5 and 0.7 as partial positive samples.
Optionally, the detection process of identifying pin defects in the image to be identified with the trained model is a cascade detection, which includes:
building an image pyramid from the input image, and detecting each pyramid level with the PNet to obtain preliminary candidate frames;
mapping the candidate frames obtained at each level back to the original image to obtain target slices, and classifying the preliminary candidate frames and regressing their boundaries with the RNet to obtain secondary candidate frames;
and classifying the secondary candidate frames that reach the set threshold and regressing their boundaries again with the ONet to obtain the detection result.
Optionally, during cascade detection, the non-maximum suppression (NMS) algorithm is used to reduce the redundancy of the candidate frames at each level.
In a second aspect, an embodiment of the present application further provides a pin defect identification apparatus based on a cascaded convolutional neural network, including:
an improvement module for improving the algorithm model: improving the original MTCNN algorithm to construct an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing multi-scale feature maps, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a three-stage network structure consisting of a candidate network PNet, a refinement network RNet and an output network ONet;
a construction module for constructing a training data set: obtaining image samples of power fittings at connection points of a power transmission line, captured by an unmanned aerial vehicle in different environments, preprocessing the image samples, and constructing a sample training set;
a training module for training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a trained model;
an identification module for using the trained model to identify pin defects in an image to be identified: inputting a power fitting image acquired during unmanned aerial vehicle inspection into the trained model to obtain a pin state recognition result.
In a third aspect, an embodiment of the present application further provides a pin defect identification device based on a cascaded convolutional neural network, including:
a memory and a processor connected to the memory;
the memory is used for storing a program, the program being used at least for implementing the method according to any implementation of the first aspect;
the processor is used for calling and executing the program stored in the memory.
The technical solutions provided by the embodiments of the present application can have the following beneficial effects:
In the technical solution provided by the embodiments of the present application, when a large-scale image is processed, a small-scale shallow fully convolutional network first traverses the aerial image quickly to search for targets, and deeper convolutional networks then classify and accurately locate, in cascade, the candidate targets produced by the shallow fully convolutional network. In addition, the shallow fully convolutional network accepts aerial images of any scale and detects quickly, and a cascade detection mechanism composed of several convolutional neural networks has an accuracy advantage over a single neural network. On this basis, the convolutional layer structure is improved, multi-scale feature maps are fused, and an angle variable is added to the cross entropy loss function used for classification; in the training stage, multi-task learning and an offline difficult sample mining strategy are used, which effectively improves the trained model. As a result, identification speed and identification accuracy are substantially improved, and the model is better suited to being ported to mobile devices.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flowchart of a pin defect identification method based on a cascaded convolutional neural network according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a model training process provided in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a pin defect identification apparatus based on a cascaded convolutional neural network according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a pin defect identification device based on a cascaded convolutional neural network according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Examples
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a pin defect identification method based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in Fig. 1, the method comprises at least the following steps:
S101: improving an algorithm model: improving the original MTCNN algorithm to construct an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing multi-scale feature maps, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a three-stage network structure consisting of a candidate network PNet, a refinement network RNet and an output network ONet;
Specifically, the MTCNN (Multi-task Cascaded Convolutional Networks) algorithm uses three cascaded networks and detects by combining candidate frames with classifiers. The candidate network PNet is a shallow fully convolutional network that quickly generates candidate windows, the refinement network RNet is a deeper convolutional network that filters the candidate windows with high precision, and the output network ONet is a deeper convolutional network that generates the final bounding boxes.
In some embodiments, improving the convolutional layer structure comprises:
adding a nonlinear multilayer perceptron after some of the convolutional layers, while decomposing the convolution kernels of the original convolutions and removing the fully connected layers; wherein the calculation formula of each feature map in the nonlinear multilayer perceptron is as follows:
where n denotes the n-th layer, the formula includes an offset (bias) term, (i, j) is the position index of an image pixel, x_{i,j} is the input image patch centered at position (i, j), and k is the index of the feature map to be extracted.
Specifically, because traditional convolution is not good at extracting highly nonlinear features, a nonlinear multilayer perceptron is used to perform more complex operations on the neurons of each local receptive field. This yields features that generalize better and are more abstract, and it also reduces the dimensionality of the data.
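The formula referred to above is not present in this text. As a hedged reconstruction only, assuming the layer follows the standard "mlpconv" form of the Network in Network architecture (the source does not name it), the feature maps consistent with the symbol definitions could be written as follows. The weight vectors w, the bias symbol b and the feature-map symbol f are notation introduced here for illustration and are not taken from the source.

```latex
\begin{aligned}
f^{1}_{i,j,k_1} &= \max\!\left( \left(w^{1}_{k_1}\right)^{\top} x_{i,j} + b_{k_1},\; 0 \right) \\
&\;\;\vdots \\
f^{n}_{i,j,k_n} &= \max\!\left( \left(w^{n}_{k_n}\right)^{\top} f^{\,n-1}_{i,j} + b_{k_n},\; 0 \right)
\end{aligned}
```

Under this reading, each layer of the perceptron applies a per-position linear map followed by a rectifier, which is equivalent to a 1 × 1 convolution stacked on the preceding feature maps.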
Modifying the network structure by adding the nonlinear multilayer perceptron improves target detection accuracy but also increases the complexity of the network model, which generally slows detection. To mitigate this, the convolution kernels of the original network are decomposed, which significantly reduces the amount of computation and makes the model somewhat smaller.
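The text does not say how the kernels are decomposed. As a rough illustration only, the sketch below assumes one common choice, replacing a k × k convolution with a k × 1 convolution followed by a 1 × k convolution, and counts the weights in each case. The function names and the channel count of 32 are hypothetical.

```python
def conv_params(kernel_h, kernel_w, c_in, c_out):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_h * kernel_w * c_in * c_out


def factorized_params(k, c_in, c_out):
    """Weights when a k x k kernel is replaced by k x 1 followed by 1 x k.

    Assumption: the intermediate layer keeps c_out channels.
    """
    return conv_params(k, 1, c_in, c_out) + conv_params(1, k, c_out, c_out)


# Hypothetical layer with 32 input and 32 output channels.
full = conv_params(3, 3, 32, 32)      # 3 * 3 * 32 * 32
split = factorized_params(3, 32, 32)  # 3 * 32 * 32 + 3 * 32 * 32
print(full, split, split / full)
```

With equal input and output channels, the factorized form needs 2k weights per channel pair instead of k², so the saving grows with the kernel size.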
To further improve the expressive power of the network, the first-layer 3 × 3 convolutions of PNet, RNet and ONet in the MTCNN cascaded convolutional neural network aggregate features over multiple scales and convolve them again. In addition, removing the fully connected layers, that is, replacing the fully connected layers of a conventional CNN with global pooling, reduces the number of network parameters and the risk of overfitting.
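To make the last point concrete, the following minimal sketch shows global pooling collapsing each feature map to a single value without any learned weights. Average pooling is assumed here; the source says only "global pooling". The function name is hypothetical.

```python
def global_average_pool(feature_maps):
    """Reduce each H x W feature map (a list of rows) to its mean value."""
    pooled = []
    for fmap in feature_maps:
        total = sum(sum(row) for row in fmap)
        count = len(fmap) * len(fmap[0])
        pooled.append(total / count)
    return pooled


# Two 2 x 2 feature maps become a 2-element vector with no trainable parameters.
vector = global_average_pool([
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
])
print(vector)
```

A fully connected layer over the same input would need one weight per input value per output unit, which is where the parameter saving comes from.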
In some embodiments, fusing the multi-scale feature maps includes:
performing three layers of convolution on a detection window of 12 × 12 pixels in the PNet, then performing the loss calculation, and fusing the feature maps of the first and third layers of the PNet;
fusing the feature maps of the second and third layers of the RNet and of the ONet on the same principle.
Specifically, fusing multi-scale feature maps means combining the detection results of different layers to improve detection performance: predictions are made separately on features at several scales, and the prediction results are then fused. Feature map fusion is added to the MTCNN cascaded convolutional neural network to improve pin defect identification, for example in resolution and position information.
In some embodiments, improving the loss function comprises:
rewriting the loss, by means of an angle variable, as a function of the angle, and introducing an integer N to enlarge the angular margin; the improved loss function is:
where L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
Specifically, in the MTCNN model the classification task uses a cross entropy loss function. Because this function uses only a plane as the classification boundary, an inter-class distance that is too small leads to poor classification of samples near the boundary; a good loss function should yield a sufficiently small intra-class distance and a sufficiently large inter-class distance. The loss function is improved as described above for this reason. The improvement effectively increases the inter-class distance, which makes the decision regions more discriminative, and compresses the intra-class angular distribution.
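The improved loss formula itself is not present in this text. The sketch below is therefore an assumption, not the patent's formula: it uses an angular-margin softmax of the A-Softmax kind, in which the target-class logit uses a monotonic extension of cos(Nθ) instead of cos θ. The function names, the feature norm `x_norm` and the example angles are hypothetical.

```python
import math


def psi(theta, n):
    """Monotonic extension of cos(n * theta) for theta in [0, pi]."""
    k = min(int(theta * n / math.pi), n - 1)
    return (-1) ** k * math.cos(n * theta) - 2 * k


def angular_margin_loss(x_norm, thetas, target, n):
    """Cross entropy over cosine logits, with an angular margin on the target.

    thetas[j] is the angle between the feature and the weight vector of class j.
    """
    logits = [x_norm * math.cos(t) for t in thetas]
    logits[target] = x_norm * psi(thetas[target], n)
    top = max(logits)  # subtract the maximum for numerical stability
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]


plain = angular_margin_loss(1.0, [0.5, 1.5], target=0, n=1)
margin = angular_margin_loss(1.0, [0.5, 1.5], target=0, n=2)
print(plain, margin)
```

With N = 1 this reduces to ordinary cross entropy on cosine logits. A larger N penalizes the same angle more heavily, so the network has to pull samples closer to their class direction, which matches the stated aim of compressing the intra-class angular distribution.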
In some embodiments, improving the training strategy comprises:
dividing the training process into two steps, pre-training and offline difficult sample training, so as to maximize the generalization performance of the model when labeled data are limited. The specific training process is explained in the steps below.
S102: constructing a training data set: obtaining image samples of power fittings at connection points of a power transmission line, captured by an unmanned aerial vehicle in different environments, preprocessing the image samples, and constructing a sample training set;
The purpose of the preprocessing is to facilitate subsequent training. Specifically, the original images undergo data augmentation such as illumination, mirroring, color and environment transformations to obtain a number of aerial images (for example, 6000), which are then labeled with a labeling tool; further, to expand the quantity and diversity of the training data, training samples are divided into 4 categories: positive samples, negative samples, partial positive samples and difficult samples.
Regarding the sample types, before training the image samples are divided into positive samples, negative samples and partial positive samples according to the intersection over union (IOU). The IOU measures localization accuracy in target detection and is defined as the ratio of the intersection to the union of the areas of the prediction frame P and the labeling frame G:
$$\mathrm{IOU} = \frac{\operatorname{area}(P \cap G)}{\operatorname{area}(P \cup G)}$$
Specifically, one possible partitioning method is: selecting training samples by a sliding window method, that is, building an image pyramid from each training image and selecting regions of the image with a sliding window of a set size; then calculating the IOU between each selected region and the labeling box, and recording regions with an IOU greater than 0.7 as positive samples, regions with an IOU less than 0.3 as negative samples, and regions with an IOU between 0.5 and 0.7 as partial positive samples.
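The partitioning rule above can be sketched as follows. The thresholds 0.7, 0.3 and 0.5 come from the text; the box format `(x1, y1, x2, y2)`, the function names, and the "ignored" label for windows whose IOU falls between 0.3 and 0.5 are assumptions made for illustration, since the text does not say what happens to those windows.

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_window(window, labeling_box):
    """Assign a sample type to a sliding window from its IOU with the label."""
    score = iou(window, labeling_box)
    if score > 0.7:
        return "positive"
    if score < 0.3:
        return "negative"
    if score >= 0.5:
        return "part"
    return "ignored"  # assumption: windows with 0.3 <= IOU < 0.5 are not used


print(label_window((0, 0, 10, 10), (0, 0, 10, 6)))
```

In the example, the overlap is 60 square pixels out of a union of 100, so the window is recorded as a partial positive sample.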
In addition, to keep the classes of normal pin targets and pin defect targets balanced in the training samples, 70% of the aerial images are selected as the training set and the remaining 30% as the test set, which is used to test the generalization and practicability of the model.
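One way to realize such a split while preserving the ratio of normal to defective pins is a stratified split, sketched below. The text gives only the 70/30 proportion; splitting each class separately, the label names and the fixed random seed are assumptions made for illustration.

```python
import random


def stratified_split(samples, train_fraction=0.7, seed=0):
    """Split (image_id, label) pairs so each label keeps the same proportion."""
    rng = random.Random(seed)
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Hypothetical data set: 100 normal pins and 50 defective pins.
data = [(i, "normal") for i in range(100)] + [(i, "defect") for i in range(100, 150)]
train_set, test_set = stratified_split(data)
print(len(train_set), len(test_set))
```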
S103: training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a trained model; the cascaded convolutional neural network comprises a three-stage network structure consisting of a candidate network PNet, a refinement network RNet and an output network ONet;
To maximize the generalization performance of the model when labeled data are limited, the training process is divided into two steps: pre-training and offline difficult sample training.
Referring to Fig. 2 for the training process: in the pre-training stage, an Online Hard Example Mining (OHEM) strategy is adopted, that is, during training, the losses computed for each batch of data are sorted and a set proportion of the samples with the largest losses are taken as difficult samples; during back propagation, the weights of the neural network model are updated using only the losses of the difficult samples. The pre-trained MTCNN model is then used to detect the training images, which yields a large number of falsely detected and missed targets; these targets are the difficult samples. During back propagation, the OHEM strategy ignores the gradients of samples that are easy to classify. Most candidate samples are easy samples, and when easy samples dominate, the model cannot learn to distinguish difficult samples efficiently, which is a main cause of false detections and missed detections. Adding the difficult samples to the training data can therefore reduce the false detection rate and the missed detection rate.
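The selection step of online difficult sample mining can be sketched as below. The text specifies only "a certain proportion"; the default `hard_ratio` of 0.7 and the function names are assumptions, and a real training loop would apply the same selection to per-sample losses inside a deep learning framework before back propagation.

```python
def select_hard_samples(losses, hard_ratio=0.7):
    """Return the indices of the samples with the largest losses.

    losses must be non-empty; at least one sample is always kept.
    """
    keep = max(1, int(len(losses) * hard_ratio))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:keep])


def ohem_batch_loss(losses, hard_ratio=0.7):
    """Average the loss over the difficult samples only."""
    hard = select_hard_samples(losses, hard_ratio)
    return sum(losses[i] for i in hard) / len(hard)


batch_losses = [0.1, 0.9, 0.5, 0.3]
print(select_hard_samples(batch_losses, hard_ratio=0.5))
print(ohem_batch_loss(batch_losses, hard_ratio=0.5))
```

Because only the selected losses contribute to the batch loss, easy samples produce no gradient in that step, which is the behavior the paragraph above describes.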
In the offline difficult sample training stage, to suit the different input scales of the networks, the obtained positive samples, negative samples and partial positive samples are scaled to 12 × 12 pixels and 24 × 24 pixels to train the PNet and the improved RNet respectively; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
S104: performing pin defect recognition on the image to be recognized by using the training model: the electric power fitting image acquired by unmanned aerial vehicle inspection is input into the training model to obtain a pin state recognition result.
Specifically, the detection process of performing pin defect recognition on the image to be recognized by using the training model is cascade detection, which includes:
building an image pyramid from the input image, and detecting each pyramid level with PNet to obtain preliminary candidate frames; PNet acts as a GPU-accelerated sliding-window detector: as it slides over each pyramid level, it estimates the probability that each window contains a pin target, thereby producing a probability map for that level;
mapping the candidate frames obtained from each pyramid level back to the original image to obtain target slices, and classifying the preliminary candidate frames and performing bounding-box regression on them with RNet to obtain secondary candidate frames;
and classifying the secondary candidate frames that reach the set threshold and performing bounding-box regression on them again with ONet to obtain the detection result.
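The three cascade steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the pyramid factor 0.709 is a value commonly used in MTCNN implementations and is not given in the source, the function names are hypothetical, and `fake_rnet` and `fake_onet` are toy stand-ins for the trained networks that return a score and a refined box.

```python
def pyramid_scales(width, height, min_window=12, factor=0.709):
    """Scales at which the shorter image side is still at least min_window pixels."""
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be in (0, 1)")
    scales = []
    scale = 1.0
    while min(width, height) * scale >= min_window:
        scales.append(scale)
        scale *= factor
    return scales


def to_original(box, scale):
    """Map an (x1, y1, x2, y2) box found at a pyramid level back to the original image."""
    return tuple(coord / scale for coord in box)


def run_cascade(candidates, stages):
    """Pass candidate boxes through (stage_fn, threshold) pairs in order.

    Each stage_fn takes a box and returns (score, refined_box); boxes whose
    score is below the stage threshold are dropped.
    """
    for stage_fn, threshold in stages:
        survivors = []
        for box in candidates:
            score, refined = stage_fn(box)
            if score >= threshold:
                survivors.append(refined)
        candidates = survivors
    return candidates


# Toy stand-ins for RNet and ONet, for illustration only.
def fake_rnet(box):
    return (0.9 if box[0] < 100 else 0.2), box


def fake_onet(box):
    x1, y1, x2, y2 = box
    return 0.95, (x1 + 1, y1 + 1, x2 - 1, y2 - 1)


# Boxes as a PNet-like stage might report them, each with its pyramid scale.
level_boxes = [((6, 6, 18, 18), 0.5), ((80, 80, 92, 92), 0.5)]
candidates = [to_original(box, scale) for box, scale in level_boxes]
detections = run_cascade(candidates, [(fake_rnet, 0.6), (fake_onet, 0.7)])
```

Because the stages are passed in as a list, a non-maximum suppression step could be applied to `candidates` between stages without changing `run_cascade`.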
In addition, candidate frame generation during cascade detection produces a large number of overlapping frames, so in some embodiments the non-maximum suppression (NMS) algorithm is used to reduce the redundancy of the candidate frames at each level.
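A greedy form of non-maximum suppression, which is the usual variant, can be sketched as follows. The source does not say which NMS variant or overlap threshold is used, so the `iou_threshold` default of 0.5 and the `(x1, y1, x2, y2, score)` tuple layout are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over (x1, y1, x2, y2, score) tuples.

    Repeatedly keeps the highest-scoring frame and discards every remaining
    frame that overlaps it by more than iou_threshold (an assumed value).
    """
    remaining = sorted(detections, key=lambda d: d[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining
                     if iou(best[:4], d[:4]) <= iou_threshold]
    return kept


frames = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # overlaps the first frame heavily
    (50, 50, 60, 60, 0.7),  # separate target
]
kept_frames = nms(frames)
```

Here the second frame is dropped because its IOU with the first is about 0.68, while the third frame survives because it does not overlap the others.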
In this way, accurate detection results can be obtained.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
In the technical scheme provided by the embodiments of the application, when a large image is processed, a small shallow fully convolutional network first traverses the aerial image quickly to search for targets, and deeper convolutional networks then classify and precisely locate, in cascade, the candidate targets produced by the preceding shallow fully convolutional network. The shallow fully convolutional network accepts aerial images of any scale and detects quickly, and a cascade detection mechanism composed of several convolutional neural networks is more accurate than a single neural network. On this basis, the convolutional layer structure is improved, multi-scale feature maps are fused, and an angle variable is added to the classification cross-entropy loss function; in the training stage, multi-task learning and an offline difficult sample mining strategy are used, which effectively improves the performance of the training model. As a result, recognition speed and recognition accuracy are greatly improved, and the model is better suited to being ported to and applied on mobile devices.
In addition, corresponding to the pin defect identification method based on the cascaded convolutional neural network provided by the embodiment, the embodiment of the application also provides a pin defect identification device based on the cascaded convolutional neural network. The device is a functional module based on software, hardware or a combination thereof in equipment for executing the method.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a pin defect identification apparatus based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 3, the apparatus includes at least:
an improvement module 21 for improving the algorithm model: improving on the original MTCNN algorithm to construct an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing multi-scale feature maps, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a three-layer network structure consisting of a candidate network PNet, an optimized network RNet and an output network ONet;
a construction module 22 for constructing a training data set: electric power fitting image samples at connection positions in a power transmission line are captured by an unmanned aerial vehicle in different environments and preprocessed, and a sample training set is constructed;
a training module 23 for training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a training model;
an identification module 24 for performing pin defect recognition on the image to be recognized by using the training model: the electric power fitting image acquired by unmanned aerial vehicle inspection is input into the training model to obtain a pin state recognition result.
For a specific implementation method of the steps executed by each module in the apparatus, reference may be made to the foregoing method embodiment, which is not described herein again.
In addition, corresponding to the pin defect identification method based on the cascaded convolutional neural network provided by the above embodiment, an embodiment of the application also provides pin defect identification equipment based on the cascaded convolutional neural network. The equipment is an intelligent device, such as a PC or a mobile smart device, to which the above method is applied.
Referring to fig. 4, fig. 4 is a schematic structural diagram of pin defect identification equipment based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 4, the equipment includes at least:
a memory 31 and a processor 32 connected to the memory 31;
the memory 31 is used for storing a program, and the program is at least used for implementing the pin defect identification method based on the cascaded convolutional neural network described in the above method embodiment;
the processor 32 is used to call and execute the program stored in the memory 31.
For specific implementation methods of each step of the method implemented by the program, reference may be made to the foregoing method embodiments, and details are not described here again.
With this scheme, recognition speed and recognition accuracy are greatly improved, and the model is better suited to being ported to and applied on mobile devices.
It is understood that the same or similar parts of the above embodiments may be referred to one another, and that content not described in detail in one embodiment may be found in the same or similar parts of the other embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
Claims (10)
1. A pin defect identification method based on a cascaded convolutional neural network, characterized by comprising the following steps:
improving an algorithm model: improving on the original MTCNN algorithm to construct an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing multi-scale feature maps, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a three-layer network structure consisting of a candidate network PNet, an optimized network RNet and an output network ONet;
constructing a training data set: capturing, by an unmanned aerial vehicle in different environments, electric power fitting image samples at connection positions in a power transmission line, preprocessing the electric power fitting image samples, and constructing a sample training set;
training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a training model;
and performing pin defect recognition on the image to be recognized by using the training model: inputting the electric power fitting image acquired by unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
2. The method of claim 1, wherein improving the convolutional layer structure comprises:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers; wherein the calculation formula of each feature map in the nonlinear multilayer perceptron is as follows:
3. The method of claim 1, wherein the fusing the multi-scale feature maps comprises:
performing three-layer convolution on a detection window of 12 × 12 pixels in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
4. The method of claim 1, wherein improving the loss function comprises:
expressing the loss, by means of an angle variable, as a formula in terms of the angle, and introducing an integer N to amplify the angle; the improved loss function is:
in the formula, L represents a loss function, θ is an angle between the classification plane and W, and W is a weight vector of the neuron.
5. The method of claim 1, wherein improving the training strategy comprises:
dividing the training process into two steps: pre-training and offline difficult sample training;
before training, dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection over union (IOU);
in the pre-training stage, adopting a strategy of online difficult sample mining, namely, during training, sorting the propagation losses computed for each batch of data and designating the samples with the largest propagation losses as difficult samples according to a set proportion; and, during back propagation, updating the weights of the neural network model using only the losses of the difficult samples;
in the offline difficult sample training stage, because the networks take input images of different scales, scaling the obtained positive samples, negative samples and partial positive samples to 12 × 12 pixels and 24 × 24 pixels to train the PNet and the improved RNet, respectively; and finally, scaling the acquired difficult samples, positive samples and partial positive samples to 48 × 48 pixels to retrain the improved ONet.
6. The method of claim 5, wherein dividing the image samples into positive samples, negative samples and partial positive samples according to the IOU comprises:
selecting training samples by a sliding window method, namely building an image pyramid from a training image and selecting regions of the image with a sliding window of a set size;
and calculating the IOU between each selected region and the annotation box, recording regions with an IOU greater than 0.7 as positive samples, regions with an IOU less than 0.3 as negative samples, and regions with an IOU between 0.5 and 0.7 as partial positive samples.
7. The method according to claim 1, wherein the detection process of pin defect recognition on the image to be recognized by using the training model is cascade detection, which comprises:
building an image pyramid from the input image, and detecting each pyramid level with PNet to obtain preliminary candidate frames;
mapping the candidate frames obtained from each pyramid level back to the original image to obtain target slices, and classifying the preliminary candidate frames and performing bounding-box regression on them with RNet to obtain secondary candidate frames;
and classifying the secondary candidate frames that reach the set threshold and performing bounding-box regression on them again with ONet to obtain the detection result.
8. The method according to claim 7, characterized in that during the cascade detection, a non-maximum suppression algorithm NMS is used to reduce the redundancy of the candidate boxes at each level.
9. A pin defect identification device based on a cascaded convolutional neural network, characterized by comprising:
an improvement module for improving the algorithm model: improving on the original MTCNN algorithm to construct an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing multi-scale feature maps, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a three-layer network structure consisting of a candidate network PNet, an optimized network RNet and an output network ONet;
a construction module for constructing a training data set: electric power fitting image samples at connection positions in a power transmission line are captured by an unmanned aerial vehicle in different environments and preprocessed, and a sample training set is constructed;
a training module for training the cascaded convolutional neural network with the sample training set, based on the improved MTCNN algorithm model, to obtain a training model;
an identification module for performing pin defect recognition on the image to be recognized by using the training model: the electric power fitting image acquired by unmanned aerial vehicle inspection is input into the training model to obtain a pin state recognition result.
10. Pin defect identification equipment based on a cascaded convolutional neural network, comprising:
a memory and a processor connected to the memory;
the memory is used for storing a program, the program being used at least for implementing the method of any one of claims 1-8;
the processor is used for calling and executing the program stored in the memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110109172.2A CN112837281B (en) | 2021-01-27 | 2021-01-27 | Pin defect identification method, device and equipment based on cascade convolution neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110109172.2A CN112837281B (en) | 2021-01-27 | 2021-01-27 | Pin defect identification method, device and equipment based on cascade convolution neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112837281A (en) | 2021-05-25 |
CN112837281B (en) | 2022-10-28 |
Family
ID=75932057
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110109172.2A Active CN112837281B (en) | 2021-01-27 | 2021-01-27 | Pin defect identification method, device and equipment based on cascade convolution neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112837281B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113469951A (en) * | 2021-06-08 | 2021-10-01 | 燕山大学 | Hub defect detection method based on cascade region convolutional neural network |
CN113538387A (en) * | 2021-07-23 | 2021-10-22 | 广东电网有限责任公司 | Multi-scale inspection image identification method and device based on deep convolutional neural network |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160148079A1 (en) * | 2014-11-21 | 2016-05-26 | Adobe Systems Incorporated | Object detection using cascaded convolutional neural networks |
CN107748858A (en) * | 2017-06-15 | 2018-03-02 | 华南理工大学 | A kind of multi-pose eye locating method based on concatenated convolutional neutral net |
CN109145854A (en) * | 2018-08-31 | 2019-01-04 | 东南大学 | A kind of method for detecting human face based on concatenated convolutional neural network structure |
CN110210354A (en) * | 2019-05-23 | 2019-09-06 | 南京邮电大学 | A kind of detection of haze weather traffic mark with know method for distinguishing |
US20200226421A1 (en) * | 2019-01-15 | 2020-07-16 | Naver Corporation | Training and using a convolutional neural network for person re-identification |
CN111650204A (en) * | 2020-05-11 | 2020-09-11 | 安徽继远软件有限公司 | Transmission line hardware defect detection method and system based on cascade target detection |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160148079A1 (en) * | 2014-11-21 | 2016-05-26 | Adobe Systems Incorporated | Object detection using cascaded convolutional neural networks |
CN107748858A (en) * | 2017-06-15 | 2018-03-02 | 华南理工大学 | A kind of multi-pose eye locating method based on concatenated convolutional neutral net |
CN109145854A (en) * | 2018-08-31 | 2019-01-04 | 东南大学 | A kind of method for detecting human face based on concatenated convolutional neural network structure |
US20200226421A1 (en) * | 2019-01-15 | 2020-07-16 | Naver Corporation | Training and using a convolutional neural network for person re-identification |
CN110210354A (en) * | 2019-05-23 | 2019-09-06 | 南京邮电大学 | A kind of detection of haze weather traffic mark with know method for distinguishing |
CN111650204A (en) * | 2020-05-11 | 2020-09-11 | 安徽继远软件有限公司 | Transmission line hardware defect detection method and system based on cascade target detection |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113469951A (en) * | 2021-06-08 | 2021-10-01 | 燕山大学 | Hub defect detection method based on cascade region convolutional neural network |
CN113538387A (en) * | 2021-07-23 | 2021-10-22 | 广东电网有限责任公司 | Multi-scale inspection image identification method and device based on deep convolutional neural network |
CN113538387B (en) * | 2021-07-23 | 2024-04-05 | 广东电网有限责任公司 | Multi-scale inspection image identification method and device based on deep convolutional neural network |
Also Published As
Publication number | Publication date |
---|---|
CN112837281B (en) | 2022-10-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110503112B (en) | Small target detection and identification method for enhancing feature learning | |
CN110827251B (en) | Power transmission line locking pin defect detection method based on aerial image | |
CN111767882B (en) | Multi-mode pedestrian detection method based on improved YOLO model | |
CN114627360B (en) | Substation equipment defect identification method based on cascade detection model | |
CN114240878A (en) | Routing inspection scene-oriented insulator defect detection neural network construction and optimization method | |
CN111967480A (en) | Multi-scale self-attention target detection method based on weight sharing | |
CN114758288B (en) | Power distribution network engineering safety control detection method and device | |
CN114972213A (en) | Two-stage mainboard image defect detection and positioning method based on machine vision | |
CN111582092B (en) | Pedestrian abnormal behavior detection method based on human skeleton | |
CN112434723B (en) | Day/night image classification and object detection method based on attention network | |
CN113052834A (en) | Pipeline defect detection method based on convolution neural network multi-scale features | |
CN112837281B (en) | Pin defect identification method, device and equipment based on cascade convolution neural network | |
CN114463759A (en) | Lightweight character detection method and device based on anchor-frame-free algorithm | |
CN111223087B (en) | Automatic bridge crack detection method based on generation countermeasure network | |
CN116385958A (en) | Edge intelligent detection method for power grid inspection and monitoring | |
CN113096085A (en) | Container surface damage detection method based on two-stage convolutional neural network | |
CN116152658A (en) | Forest fire smoke detection method based on domain countermeasure feature fusion network | |
CN117934375A (en) | Lightweight lithium battery surface defect detection method for enhancing image feature fusion | |
CN115240259A (en) | Face detection method and face detection system based on YOLO deep network in classroom environment | |
CN116895030A (en) | Insulator detection method based on target detection algorithm and attention mechanism | |
CN113887455B (en) | Face mask detection system and method based on improved FCOS | |
CN113012107B (en) | Power grid defect detection method and system | |
Cao et al. | A spatial pyramid pooling convolutional neural network for smoky vehicle detection | |
CN117113066A (en) | Transmission line insulator defect detection method based on computer vision | |
CN117392568A (en) | Method for unmanned aerial vehicle inspection of power transformation equipment in complex scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||