CN112837281A - Pin defect identification method, device and equipment based on cascade convolutional neural network - Google Patents

Pin defect identification method, device and equipment based on cascade convolutional neural network

Info

Publication number
CN112837281A
Authority
CN
China
Prior art keywords
training
sample
image
model
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110109172.2A
Other languages
Chinese (zh)
Other versions
CN112837281B (en)
Inventor
肖业伟 (Xiao Yewei)
李志强 (Li Zhiqiang)
郭雪峰 (Guo Xuefeng)
陈志豪 (Chen Zhihao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University filed Critical Xiangtan University
Priority to CN202110109172.2A priority Critical patent/CN112837281B/en
Publication of CN112837281A publication Critical patent/CN112837281A/en
Application granted granted Critical
Publication of CN112837281B publication Critical patent/CN112837281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a pin defect identification method, device and equipment based on a cascaded convolutional neural network. The method comprises the following steps: making improvements on the basis of the original MTCNN algorithm to construct an improved MTCNN algorithm model; capturing image samples of electric power fittings at connection points of power transmission lines with an unmanned aerial vehicle under different environments, preprocessing the samples, and constructing a sample training set; training the cascaded convolutional neural network with the sample training set on the basis of the improved MTCNN algorithm model to obtain a trained model; and performing pin defect identification on images to be identified with the trained model: electric power fitting images collected during unmanned aerial vehicle inspection are input into the trained model to obtain pin state identification results. Compared with traditional identification methods, the identification speed and accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.

Description

Pin defect identification method, device and equipment based on cascade convolutional neural network
Technical Field
The application relates to the technical field of pin defect identification of power transmission lines, in particular to a pin defect identification method, device and equipment based on a cascaded convolutional neural network.
Background
Electric power fittings are essential metal components of a power system: in overhead transmission lines they fix, support, protect and connect electrical components, and they play an indispensable role in the stable, safe and reliable operation of the power system. However, most electric power fittings are not only exposed to severe outdoor environments, but also bear, over long periods, the mechanical tension loads of other external power equipment and the loads generated by power transmission inside the power system; under these internal and external loads, the pins on the fittings are easily lost, causing operation failures of the power system.
In recent years, unmanned aerial vehicle (UAV) inspection has been widely applied in the daily inspection of power transmission lines; it reduces the workload and the danger of climbing inspections by grid operation and maintenance personnel, and it can judge the fault condition of power equipment efficiently and accurately. With the improvement of the power system and the large increase in transmission equipment, aerial images have grown explosively, and traditional manual identification and detection is inefficient. The detection methods for pin defect identification therefore face a serious challenge.
Existing object detection research struggles to detect pin defects in aerial power transmission line images with complex backgrounds and large scales. There are several technical difficulties: first, blurred images, complex backgrounds, small targets, varied appearances and partial occlusion of the detected targets cause false detections and missed detections; second, high-precision convolutional neural networks generally have deep architectures, so their computation and storage overheads are extremely high; third, deeper network structures require a large amount of labeled data, their training is complex and inefficient, and high detection accuracy is difficult to achieve without a sufficient data set; and fourth, because object detection algorithms are designed for conventional images with fixed sizes during training and detection, they lack generality and scale adaptability for object detection in aerial transmission line images.
Disclosure of Invention
In view of the above, the present application aims to overcome the defects of the prior art, and provides a pin defect identification method, apparatus and device based on a cascaded convolutional neural network.
The above object of the present application is achieved by the following technical solutions:
in a first aspect, an embodiment of the present application provides a pin defect identification method based on a cascaded convolutional neural network, including:
improving an algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
and utilizing the training model to perform pin defect recognition on the image to be recognized: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
Optionally, the improved convolutional layer structure includes:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers at the same time; wherein each feature map in the nonlinear multilayer perceptron is calculated as:

f^n_{i,j,k} = max( (w^n_k)^T x_{i,j} + b^n_k , 0 )

wherein n represents the n-th layer, w^n_k and b^n_k represent the weight vector and the offset (bias) of the k-th feature map in the n-th layer, (i, j) represents the position index of an image pixel, x_{i,j} represents the input image patch centered at position (i, j), and k represents the index of the feature map to be extracted.
Optionally, the fusing the multi-scale feature map includes:
performing three-layer convolution on a 12 × 12 pixel detection window in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
Optionally, the improved loss function includes:
rewriting the loss as a function of an angle variable θ and introducing an integer N to enlarge the angle, the improved loss function being:
[equation not reproduced here: the classification cross-entropy loss expressed in terms of the angle θ, with the target-class angle multiplied by the integer N]
where L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
Optionally, the improved training strategy includes:
dividing the training process into two steps of pre-training and off-line difficult sample training;
before training, dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection-over-union (IOU);
in the pre-training stage, a strategy of on-line difficult sample mining is adopted, namely in the training process, the propagation losses generated by calculating each batch of data are sequenced, and the samples with the largest propagation losses are divided into difficult samples according to a certain proportion; updating weights in the neural network model only with the loss of difficult samples when backward propagation is performed;
in an off-line difficult sample training stage, aiming at network models of input images with different scales, scaling the obtained positive samples, negative samples and partial positive samples into 12 × 12 pixels and 24 × 24 pixels, and respectively training PNet and improved RNet; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
Optionally, dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection-over-union ratio includes:
selecting a training sample by a sliding window method, namely performing pyramid processing on a training image, and performing region selection on the image by using a sliding window with a set size;
and calculating the selected area and the IOU of the labeling box, and recording the area with the IOU larger than 0.7 as a positive sample, recording the area with the IOU smaller than 0.3 as a negative sample, and recording the area between 0.5 and 0.7 as a partial positive sample.
Optionally, the detection process of performing pin defect recognition on the image to be recognized by using the training model is cascade detection, which includes:
carrying out pyramid processing on the input image, and detecting each level of processed image by means of PNet to obtain a preliminary candidate frame;
mapping the candidate frame obtained from each level of image to an original image to obtain a target slice, and classifying and performing boundary regression on the primary candidate frame by means of the RNet to obtain a secondary candidate frame;
and classifying and performing boundary regression on the secondary candidate frames reaching the set threshold value again by using ONet to obtain a detection result.
Optionally, in the cascade detection process, the redundancy of each level of candidate frames is reduced by using a non-maximum suppression algorithm NMS.
In a second aspect, an embodiment of the present application further provides a pin defect identification apparatus based on a cascaded convolutional neural network, including:
an improvement module for improving the algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
a construction module for constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
the training module is used for training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
the identification module is used for utilizing the training model to identify the pin defects of the image to be identified: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
In a third aspect, an embodiment of the present application further provides a pin defect identification device based on a cascaded convolutional neural network, including:
the processor is connected with the memory;
the memory for storing a program for implementing at least the method according to any of the first aspects;
the processor is used for calling and executing the program stored in the memory.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the technical solution provided by the embodiments of the application, first, when processing large-scale images, a small-scale shallow fully convolutional network traverses the aerial image quickly to search for targets, and deeper convolutional networks then perform cascaded classification and precise localization of the candidate targets produced by the preceding single-scale shallow fully convolutional network. In addition, the shallow fully convolutional network can accept aerial images of any scale and detects quickly, and a cascade detection mechanism composed of several convolutional neural networks has a clear precision advantage over a single neural network. On this basis, the convolutional layer structure is improved, multi-scale feature maps are fused, and an angle variable is added to the classification cross-entropy loss function; in the training stage, multi-task learning and an offline difficult sample mining strategy are used, which effectively improves the trained model. As a result, the identification speed and accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flowchart of a pin defect identification method based on a cascaded convolutional neural network according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a model training process provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a pin defect identification apparatus based on a cascaded convolutional neural network according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a pin defect identification device based on a cascaded convolutional neural network according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Examples
Referring to fig. 1, fig. 1 is a schematic flowchart of a pin defect identification method based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 1, the method comprises at least the following steps:
s101: improving an algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
specifically, the MTCNN (Multi-task Cascaded Convolutional Network) algorithm uses three cascaded networks and the candidate-box-plus-classifier idea for detection. The candidate network PNet is a shallow fully convolutional network that quickly generates candidate windows, the optimized network RNet is a deeper network that filters the candidate windows with higher precision, and the output network ONet is a still deeper network that produces the final bounding boxes.
In some embodiments, the improved convolutional layer structure comprises:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers at the same time; wherein each feature map in the nonlinear multilayer perceptron is calculated as:

f^n_{i,j,k} = max( (w^n_k)^T x_{i,j} + b^n_k , 0 )

wherein n represents the n-th layer, w^n_k and b^n_k represent the weight vector and the offset (bias) of the k-th feature map in the n-th layer, (i, j) represents the position index of an image pixel, x_{i,j} represents the input image patch centered at position (i, j), and k represents the index of the feature map to be extracted.
Specifically, because traditional convolution has limited ability to extract highly nonlinear features, a nonlinear multilayer perceptron is used to perform more complex operations on the neurons of each local receptive field. This not only yields features with stronger generalization ability and a higher level of abstraction, but also reduces the dimensionality of the data.
Modifying the network structure in this way (adding the nonlinear multilayer perceptron) improves target detection accuracy but increases the complexity of the network model, which generally reduces the detection speed of the model. To alleviate this problem, the convolution kernels in the original network are decomposed, which significantly reduces the amount of computation and, to some extent, the size of the model.
To further improve the expressive power of the network, the first-layer 3 × 3 convolution operations of PNet, RNet and ONet in the MTCNN cascaded convolutional neural network aggregate features over multiple scales before convolving them again. In addition, the fully connected layers are removed, i.e., the fully connected layer of a conventional CNN is replaced with global pooling, which reduces the number of network parameters and the possibility of overfitting.
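As an illustration of these structural changes, the following PyTorch block combines a factorized convolution, stacked 1 × 1 convolutions acting as the nonlinear multilayer perceptron, and global average pooling in place of a fully connected layer. This is a sketch under assumptions: the channel counts, the decomposition into 3 × 1 and 1 × 3 kernels, and the use of ReLU are illustrative choices, not the patent's exact layer configuration.

```python
import torch
import torch.nn as nn

class ImprovedConvBlock(nn.Module):
    """Illustrative block: factorized 3x3 convolution, 1x1 'mlpconv' layers
    (the nonlinear multilayer perceptron) and global average pooling."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.features = nn.Sequential(
            # 3x3 kernel decomposed into a 3x1 and a 1x3 convolution
            nn.Conv2d(in_ch, out_ch, kernel_size=(3, 1), padding=(1, 0)),
            nn.Conv2d(out_ch, out_ch, kernel_size=(1, 3), padding=(0, 1)),
            nn.ReLU(inplace=True),
            # nonlinear multilayer perceptron realized as stacked 1x1 convolutions
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        # global average pooling replaces the fully connected layer
        self.gap = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.gap(self.features(x)).flatten(1)   # (batch, out_ch)

if __name__ == "__main__":
    block = ImprovedConvBlock(3, 16)
    print(block(torch.randn(2, 3, 48, 48)).shape)      # torch.Size([2, 16])
```

Stacking 1 × 1 convolutions in this way follows the network-in-network idea: each local receptive field is processed by a small perceptron instead of a single linear filter.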
In some embodiments, the fusing the multi-scale feature maps includes:
performing three-layer convolution on a 12 × 12 pixel detection window in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
Specifically, multi-scale feature map fusion follows the idea of combining the detection results of different layers to improve detection performance: predictions are made separately from the multi-scale features, and the prediction results are then fused. To improve the performance of pin defect identification in terms of resolution, position information and the like, this feature map fusion is added to the MTCNN cascaded convolutional neural network.
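A minimal sketch of this kind of fusion, assuming PyTorch; resizing the shallow map with bilinear interpolation and concatenating along the channel axis are illustrative assumptions, since the patent does not state the exact fusion operator:

```python
import torch
import torch.nn.functional as F

def fuse_feature_maps(shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
    """Resize the shallow (e.g. first-layer) feature map to the spatial size of the
    deep (e.g. third-layer) map and concatenate the two along the channel axis."""
    shallow_resized = F.interpolate(
        shallow, size=deep.shape[-2:], mode="bilinear", align_corners=False
    )
    return torch.cat([shallow_resized, deep], dim=1)

if __name__ == "__main__":
    first_layer = torch.randn(1, 10, 24, 24)   # illustrative shallow features
    third_layer = torch.randn(1, 32, 12, 12)   # illustrative deep features
    print(fuse_feature_maps(first_layer, third_layer).shape)  # torch.Size([1, 42, 12, 12])
```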
In some embodiments, the improved loss function comprises:
rewriting the loss as a function of an angle variable θ and introducing an integer N to enlarge the angle, the improved loss function being:
[equation not reproduced here: the classification cross-entropy loss expressed in terms of the angle θ, with the target-class angle multiplied by the integer N]
where L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
Specifically, in the MTCNN model the classification task uses a cross-entropy loss function. Because this function uses only a hyperplane as the classification boundary, a too-small inter-class distance leads to poor classification of samples near the boundary; in other words, a good loss function should have a sufficiently small intra-class distance and a sufficiently large inter-class distance. The loss function is improved on this basis. After the improvement, the inter-class distance is effectively increased, so that the decision regions are better separated and the intra-class angular distribution is compressed.
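The exact formula is not reproduced in this text; the description (cross-entropy rewritten through an angle θ, with an integer N enlarging the target-class angle) matches the A-Softmax / L-Softmax family of losses. The sketch below implements that reading in PyTorch as an assumption, and it omits the piecewise monotonic correction used in the published A-Softmax loss:

```python
import torch
import torch.nn.functional as F

def angular_margin_loss(features: torch.Tensor,
                        weight: torch.Tensor,
                        labels: torch.Tensor,
                        n_margin: int = 4) -> torch.Tensor:
    """Cross-entropy expressed through the angle between a feature and each class
    weight vector; the target-class angle is multiplied by the integer n_margin."""
    eps = 1e-7
    w_unit = F.normalize(weight, dim=1)                     # (C, D) unit class weights
    x_norm = features.norm(dim=1, keepdim=True)             # (B, 1) feature magnitudes
    cos_theta = F.normalize(features, dim=1) @ w_unit.t()   # (B, C) cos(theta)
    theta = torch.acos(cos_theta.clamp(-1 + eps, 1 - eps))
    cos_enlarged = torch.cos(n_margin * theta)              # cos(N * theta)
    target = F.one_hot(labels, num_classes=weight.size(0)).bool()
    logits = torch.where(target, cos_enlarged, cos_theta) * x_norm
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    feats = torch.randn(8, 128)
    W = torch.randn(2, 128, requires_grad=True)             # 2 classes: normal / defective pin
    y = torch.randint(0, 2, (8,))
    print(angular_margin_loss(feats, W, y).item())
```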
In some embodiments, the improved training strategy comprises:
the training process is divided into two steps of pre-training and off-line difficult sample training, so that the generalization performance of the model is enhanced to the maximum extent under the condition of limited labeling data. The specific training process will be explained in the following steps.
S102: constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
among other things, the purpose of the preprocessing is to facilitate subsequent training. Specifically, the original images are augmented with illumination, mirroring, color and environment transformations to obtain a large number of (for example, 6000) aerial images, which are then annotated with a labeling tool; further, to expand the quantity and diversity of the training data, the training samples are divided into 4 categories: positive samples, negative samples, partial positive samples, and difficult samples.
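A minimal augmentation sketch in NumPy, covering mirroring plus global illumination and per-channel color scaling; the parameter ranges are illustrative assumptions, not values from the patent:

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> list:
    """Return augmented copies of an aerial image (H, W, 3, uint8):
    mirroring, global illumination scaling and per-channel color jitter."""
    copies = [image, np.fliplr(image)]                               # original + mirror
    gain = rng.uniform(0.6, 1.4)                                     # illumination change
    copies.append(np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8))
    jitter = rng.uniform(0.8, 1.2, size=(1, 1, 3))                   # color transformation
    copies.append(np.clip(image.astype(np.float32) * jitter, 0, 255).astype(np.uint8))
    return copies

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
    print(len(augment(img, rng)), "images from one original")
```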
With respect to sample types, before training the image samples are divided into positive samples, negative samples and partial positive samples according to the intersection-over-union (IOU). The IOU is a measure of localization accuracy in object detection tasks, defined as the ratio of the intersection to the union of the areas of the predicted box and the labeled box:

IOU = Area(B_p ∩ B_gt) / Area(B_p ∪ B_gt)

where B_p is the predicted box and B_gt is the labeled (ground-truth) box.
specifically, one possible partitioning method is: selecting a training sample by a sliding window method, namely performing pyramid processing on a training image, and performing region selection on the image by using a sliding window with a set size; and calculating the selected area and the IOU of the labeling box, and recording the area with the IOU larger than 0.7 as a positive sample, recording the area with the IOU smaller than 0.3 as a negative sample, and recording the area between 0.5 and 0.7 as a partial positive sample.
In addition, to keep the classes of normal pin targets and defective pin targets balanced in the training samples, 70% of the aerial images are used as the training set and the remaining 30% as the test set, which is used to verify the generalization and practicality of the model.
S103: training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model; the cascade convolutional neural network comprises a three-layer network structure of a candidate network PNet, an optimized network RNet and an output network ONet;
in order to enhance the generalization performance of the model to the maximum extent under the condition of limited labeling data, the training process is divided into two steps, namely pre-training and off-line difficult sample training.
The training process is shown in fig. 2. Specifically, in the pre-training stage an online hard example mining (OHEM) strategy is adopted: during training, the propagation losses computed for each batch of data are sorted, and the samples with the largest propagation losses are designated as difficult samples according to a certain proportion; during backward propagation, only the losses of the difficult samples are used to update the weights of the neural network model. The pre-trained MTCNN model is then used to detect the training images, which yields a large number of falsely detected and missed targets; these targets are the difficult samples. The OHEM strategy ignores the gradients of easily classified samples during back-propagation; because most selected samples would otherwise be simple samples, a large number of simple samples prevents the model from distinguishing difficult samples efficiently, which is the main cause of false and missed detections, and adding difficult samples to the training data therefore reduces the false detection rate and the missed detection rate of the results.
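A minimal sketch of the online hard example mining step in PyTorch: the per-sample losses of a batch are sorted and only the hardest fraction contributes to back-propagation. The 70% ratio is an illustrative assumption; the text only says "a certain proportion":

```python
import torch
import torch.nn.functional as F

def ohem_classification_loss(logits: torch.Tensor,
                             labels: torch.Tensor,
                             hard_ratio: float = 0.7) -> torch.Tensor:
    """Online hard example mining: compute the per-sample loss of a batch, keep only
    the hardest fraction (largest losses) and back-propagate those alone."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")   # (B,)
    n_hard = max(1, int(hard_ratio * per_sample.numel()))
    hard_losses, _ = torch.topk(per_sample, k=n_hard)                # largest losses
    return hard_losses.mean()

if __name__ == "__main__":
    logits = torch.randn(32, 2, requires_grad=True)   # 2 classes: normal / defective pin
    labels = torch.randint(0, 2, (32,))
    loss = ohem_classification_loss(logits, labels)
    loss.backward()    # only the selected hard samples contribute gradients
    print(loss.item())
```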
In addition, in an offline difficult sample training stage, aiming at network models of input images with different scales, scaling the obtained positive samples, negative samples and partial positive samples into 12 × 12 pixels and 24 × 24 pixels, and respectively training the PNet and the improved RNet; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
S104: and utilizing the training model to perform pin defect recognition on the image to be recognized: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
Specifically, the detection process of performing pin defect recognition on the image to be recognized by using the training model is cascade detection, which includes:
carrying out pyramid processing on the input image, and detecting each level of the processed image with PNet to obtain preliminary candidate frames; PNet works like a GPU-accelerated sliding-window method: while sliding over each level of the image, it estimates the probability that a window belongs to a pin target, thereby obtaining a probability map for that level of the image;
mapping the candidate frame obtained from each level of image to an original image to obtain a target slice, and classifying and performing boundary regression on the primary candidate frame by means of the RNet to obtain a secondary candidate frame;
and classifying and performing boundary regression on the secondary candidate frames reaching the set threshold value again by using ONet to obtain a detection result.
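The control flow of this cascade can be sketched as follows; the pyramid scales, score thresholds, crop sizes and the stand-in pnet/rnet/onet callables are all assumptions used only to make the flow runnable, not the patent's actual networks:

```python
import numpy as np

def crop_and_resize(image, box, size):
    """Nearest-neighbour crop of an (H, W, 3) image to a (size, size, 3) slice."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    patch = image[y1:y2, x1:x2]
    ys = np.linspace(0, patch.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, patch.shape[1] - 1, size).astype(int)
    return patch[np.ix_(ys, xs)]

def cascade_detect(image, pnet, rnet, onet, scales=(1.0, 0.7, 0.5),
                   r_thr=0.6, o_thr=0.7):
    """Pyramid -> PNet candidates -> RNet filtering -> ONet final boxes.
    pnet is assumed to handle its pyramid level internally and return boxes
    already mapped back to original-image coordinates."""
    candidates = []
    for s in scales:
        candidates.extend(pnet(image, s))
    stage2 = [b for b in candidates if rnet(crop_and_resize(image, b, 24)) > r_thr]
    return [b for b in stage2 if onet(crop_and_resize(image, b, 48)) > o_thr]

if __name__ == "__main__":
    img = np.zeros((300, 400, 3), dtype=np.uint8)
    # stand-in networks: PNet proposes fixed boxes, RNet/ONet return constant scores;
    # duplicate candidates across pyramid levels would normally be merged by NMS
    pnet = lambda im, s: [(10, 10, 60, 60), (100, 120, 180, 200)]
    rnet = lambda crop: 0.9
    onet = lambda crop: 0.8
    print(cascade_detect(img, pnet, rnet, onet))
```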
In addition, a large number of overlapping frames are generated when candidate frames are produced during cascade detection, so in some embodiments the non-maximum suppression algorithm (NMS) is used to reduce the redundancy of the candidate frames at each level.
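NMS itself is standard; a plain NumPy version is sketched below, with the 0.5 IOU suppression threshold chosen for illustration (the patent does not state a value):

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Standard non-maximum suppression: keep the highest-scoring box, drop any
    remaining box whose IOU with it exceeds the threshold, and repeat."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter + 1e-12)
        order = order[1:][iou <= iou_threshold]
    return keep

if __name__ == "__main__":
    boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]], float)
    scores = np.array([0.9, 0.8, 0.7])
    print(nms(boxes, scores))   # [0, 2]: the second box overlaps the first and is suppressed
```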
By the method, accurate detection results can be obtained.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the technical solution provided by the embodiments of the application, first, when processing large-scale images, a small-scale shallow fully convolutional network traverses the aerial image quickly to search for targets, and deeper convolutional networks then perform cascaded classification and precise localization of the candidate targets produced by the preceding single-scale shallow fully convolutional network. In addition, the shallow fully convolutional network can accept aerial images of any scale and detects quickly, and a cascade detection mechanism composed of several convolutional neural networks has a clear precision advantage over a single neural network. On this basis, the convolutional layer structure is improved, multi-scale feature maps are fused, and an angle variable is added to the classification cross-entropy loss function; in the training stage, multi-task learning and an offline difficult sample mining strategy are used, which effectively improves the trained model. As a result, the identification speed and accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.
In addition, corresponding to the pin defect identification method based on the cascaded convolutional neural network provided by the embodiment, the embodiment of the application also provides a pin defect identification device based on the cascaded convolutional neural network. The device is a functional module based on software, hardware or a combination thereof in equipment for executing the method.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a pin defect identification apparatus based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 3, the apparatus includes at least:
an improvement module 21 for improving the algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
a construction module 22 for constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
the training module 23 is configured to train the cascaded convolutional neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
the identification module 24 is configured to perform pin defect identification on the image to be identified by using the training model: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
For a specific implementation method of the steps executed by each module in the apparatus, reference may be made to the foregoing method embodiment, which is not described herein again.
In addition, corresponding to the pin defect identification method based on the cascaded convolutional neural network provided by the embodiment, the embodiment of the application also provides pin defect identification equipment based on the cascaded convolutional neural network. The device is an intelligent device, such as a PC or a mobile intelligent device, which applies the method described above.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a pin defect identification device based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 4, the apparatus includes at least:
a memory 31 and a processor 32 connected to the memory 31;
the memory 31 is used for storing a program, and the program is at least used for implementing the pin defect identification method based on the cascaded convolutional neural network described in the above method embodiment;
the processor 32 is used to call and execute the program stored in the memory 31.
For specific implementation methods of each step of the method implemented by the program, reference may be made to the foregoing method embodiments, and details are not described here again.
With this solution, the recognition speed and recognition accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A pin defect identification method based on a cascade convolution neural network is characterized by comprising the following steps:
improving an algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
and utilizing the training model to perform pin defect recognition on the image to be recognized: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
2. The method of claim 1, wherein the modified convolutional layer structure comprises:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers at the same time; wherein each feature map in the nonlinear multilayer perceptron is calculated as:

f^n_{i,j,k} = max( (w^n_k)^T x_{i,j} + b^n_k , 0 )

wherein n represents the n-th layer, w^n_k and b^n_k represent the weight vector and the offset (bias) of the k-th feature map in the n-th layer, (i, j) represents the position index of an image pixel, x_{i,j} represents the input image patch centered at position (i, j), and k represents the index of the feature map to be extracted.
3. The method of claim 1, wherein the fusing the multi-scale feature maps comprises:
performing three-layer convolution on a 12 × 12 pixel detection window in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
4. The method of claim 1, wherein the improved loss function comprises:
rewriting the loss as a function of an angle variable θ and introducing an integer N to enlarge the angle, the improved loss function being:
[equation not reproduced here: the classification cross-entropy loss expressed in terms of the angle θ, with the target-class angle multiplied by the integer N]
wherein L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
5. The method of claim 1, wherein the improved training strategy comprises:
dividing the training process into two steps of pre-training and off-line difficult sample training;
before training, dividing an image sample into a positive sample, a negative sample and a part of positive samples by means of the size of an intersection ratio IOU;
in the pre-training stage, a strategy of on-line difficult sample mining is adopted, namely in the training process, the propagation losses generated by calculating each batch of data are sequenced, and the samples with the largest propagation losses are divided into difficult samples according to a certain proportion; updating weights in the neural network model only with the loss of difficult samples when backward propagation is performed;
in an off-line difficult sample training stage, aiming at network models of input images with different scales, scaling the obtained positive samples, negative samples and partial positive samples into 12 × 12 pixels and 24 × 24 pixels, and respectively training PNet and improved RNet; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
6. The method of claim 5, wherein dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection-over-union ratio comprises:
selecting a training sample by a sliding window method, namely performing pyramid processing on a training image, and performing region selection on the image by using a sliding window with a set size;
and calculating the selected area and the IOU of the labeling box, and recording the area with the IOU larger than 0.7 as a positive sample, recording the area with the IOU smaller than 0.3 as a negative sample, and recording the area between 0.5 and 0.7 as a partial positive sample.
7. The method according to claim 1, wherein the detection process of pin defect recognition on the image to be recognized by using the training model is cascade detection, which comprises:
carrying out pyramid processing on the input image, and detecting each level of processed image by means of PNet to obtain a preliminary candidate frame;
mapping the candidate frame obtained from each level of image to an original image to obtain a target slice, and classifying and performing boundary regression on the primary candidate frame by means of the RNet to obtain a secondary candidate frame;
and classifying and performing boundary regression on the secondary candidate frames reaching the set threshold value again by using ONet to obtain a detection result.
8. The method according to claim 7, characterized in that during the cascade detection, a non-maximum suppression algorithm NMS is used to reduce the redundancy of the candidate boxes at each level.
9. A pin defect identification device based on a cascade convolution neural network is characterized by comprising:
an improvement module for improving the algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
a construction module for constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
the training module is used for training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
the identification module is used for utilizing the training model to identify the pin defects of the image to be identified: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
10. A pin defect identification device based on a cascaded convolutional neural network, comprising:
the processor is connected with the memory;
the memory for storing a program for implementing at least the method of any one of claims 1-8;
the processor is used for calling and executing the program stored in the memory.
CN202110109172.2A 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network Active CN112837281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110109172.2A CN112837281B (en) 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110109172.2A CN112837281B (en) 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network

Publications (2)

Publication Number Publication Date
CN112837281A (en) 2021-05-25
CN112837281B (en) 2022-10-28

Family

ID=75932057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110109172.2A Active CN112837281B (en) 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network

Country Status (1)

Country Link
CN (1) CN112837281B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469951A (en) * 2021-06-08 2021-10-01 燕山大学 Hub defect detection method based on cascade region convolutional neural network
CN113538387A (en) * 2021-07-23 2021-10-22 广东电网有限责任公司 Multi-scale inspection image identification method and device based on deep convolutional neural network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148079A1 (en) * 2014-11-21 2016-05-26 Adobe Systems Incorporated Object detection using cascaded convolutional neural networks
CN107748858A (en) * 2017-06-15 2018-03-02 华南理工大学 A kind of multi-pose eye locating method based on concatenated convolutional neutral net
CN109145854A (en) * 2018-08-31 2019-01-04 东南大学 A kind of method for detecting human face based on concatenated convolutional neural network structure
US20200226421A1 (en) * 2019-01-15 2020-07-16 Naver Corporation Training and using a convolutional neural network for person re-identification
CN110210354A (en) * 2019-05-23 2019-09-06 南京邮电大学 A kind of detection of haze weather traffic mark with know method for distinguishing
CN111650204A (en) * 2020-05-11 2020-09-11 安徽继远软件有限公司 Transmission line hardware defect detection method and system based on cascade target detection

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469951A (en) * 2021-06-08 2021-10-01 燕山大学 Hub defect detection method based on cascade region convolutional neural network
CN113538387A (en) * 2021-07-23 2021-10-22 广东电网有限责任公司 Multi-scale inspection image identification method and device based on deep convolutional neural network
CN113538387B (en) * 2021-07-23 2024-04-05 广东电网有限责任公司 Multi-scale inspection image identification method and device based on deep convolutional neural network

Also Published As

Publication number Publication date
CN112837281B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
CN110503112B (en) Small target detection and identification method for enhancing feature learning
CN110827251B (en) Power transmission line locking pin defect detection method based on aerial image
CN111767882B (en) Multi-mode pedestrian detection method based on improved YOLO model
CN114627360B (en) Substation equipment defect identification method based on cascade detection model
CN114240878A (en) Routing inspection scene-oriented insulator defect detection neural network construction and optimization method
CN111967480A (en) Multi-scale self-attention target detection method based on weight sharing
CN114758288B (en) Power distribution network engineering safety control detection method and device
CN114972213A (en) Two-stage mainboard image defect detection and positioning method based on machine vision
CN111582092B (en) Pedestrian abnormal behavior detection method based on human skeleton
CN112434723B (en) Day/night image classification and object detection method based on attention network
CN113052834A (en) Pipeline defect detection method based on convolution neural network multi-scale features
CN112837281B (en) Pin defect identification method, device and equipment based on cascade convolution neural network
CN114463759A (en) Lightweight character detection method and device based on anchor-frame-free algorithm
CN111223087B (en) Automatic bridge crack detection method based on generation countermeasure network
CN116385958A (en) Edge intelligent detection method for power grid inspection and monitoring
CN113096085A (en) Container surface damage detection method based on two-stage convolutional neural network
CN116152658A (en) Forest fire smoke detection method based on domain countermeasure feature fusion network
CN117934375A (en) Lightweight lithium battery surface defect detection method for enhancing image feature fusion
CN115240259A (en) Face detection method and face detection system based on YOLO deep network in classroom environment
CN116895030A (en) Insulator detection method based on target detection algorithm and attention mechanism
CN113887455B (en) Face mask detection system and method based on improved FCOS
CN113012107B (en) Power grid defect detection method and system
Cao et al. A spatial pyramid pooling convolutional neural network for smoky vehicle detection
CN117113066A (en) Transmission line insulator defect detection method based on computer vision
CN117392568A (en) Method for unmanned aerial vehicle inspection of power transformation equipment in complex scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant