CN112837281A - Pin defect identification method, device and equipment based on cascade convolutional neural network - Google Patents

Pin defect identification method, device and equipment based on cascade convolutional neural network

Info

Publication number
CN112837281A
Authority
CN
China
Prior art keywords
training
sample
image
model
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110109172.2A
Other languages
Chinese (zh)
Other versions
CN112837281B (en)
Inventor
肖业伟 (Xiao Yewei)
李志强 (Li Zhiqiang)
郭雪峰 (Guo Xuefeng)
陈志豪 (Chen Zhihao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University filed Critical Xiangtan University
Priority to CN202110109172.2A priority Critical patent/CN112837281B/en
Publication of CN112837281A publication Critical patent/CN112837281A/en
Application granted granted Critical
Publication of CN112837281B publication Critical patent/CN112837281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a pin defect identification method, device and equipment based on a cascaded convolutional neural network. The method comprises the following steps: making improvements on the basis of the original MTCNN algorithm to construct an improved MTCNN algorithm model; capturing image samples of electric power fittings at connection points of power transmission lines with an unmanned aerial vehicle under different environments, preprocessing the samples, and constructing a sample training set; training the cascaded convolutional neural network with the sample training set on the basis of the improved MTCNN algorithm model to obtain a trained model; and performing pin defect identification on images to be identified with the trained model: electric power fitting images collected during unmanned aerial vehicle inspection are input into the trained model to obtain pin state identification results. Compared with traditional identification methods, the identification speed and accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.

Description

Pin defect identification method, device and equipment based on cascade convolutional neural network
Technical Field
The application relates to the technical field of pin defect identification of power transmission lines, in particular to a pin defect identification method, device and equipment based on a cascaded convolutional neural network.
Background
Electric power fittings are essential metal components of a power system: in overhead transmission lines they fix, support, protect and connect electrical components, and they play an indispensable role in the stable, safe and reliable operation of the power system. However, most electric power fittings are not only exposed to severe outdoor environments, but also bear, over long periods, the mechanical tension loads of other external power equipment and the loads generated by power transmission inside the power system; under these internal and external loads, the pins on the fittings are easily lost, causing operation failures of the power system.
In recent years, unmanned aerial vehicle (UAV) inspection has been widely applied in the daily inspection of power transmission lines; it reduces the workload and the danger of climbing inspections by grid operation and maintenance personnel, and it can judge the fault condition of power equipment efficiently and accurately. With the improvement of the power system and the large increase in transmission equipment, aerial images have grown explosively, and traditional manual identification and detection is inefficient. The detection methods for pin defect identification therefore face a serious challenge.
Existing object detection research struggles to detect pin defects in aerial power transmission line images with complex backgrounds and large scales. There are several technical difficulties: first, blurred images, complex backgrounds, small targets, varied appearances and partial occlusion of the detected targets cause false detections and missed detections; second, high-precision convolutional neural networks generally have deep architectures, so their computation and storage overheads are extremely high; third, deeper network structures require a large amount of labeled data, their training is complex and inefficient, and high detection accuracy is difficult to achieve without a sufficient data set; and fourth, because object detection algorithms are designed for conventional images with fixed sizes during training and detection, they lack generality and scale adaptability for object detection in aerial transmission line images.
Disclosure of Invention
In view of the above, the present application aims to overcome the defects of the prior art, and provides a pin defect identification method, apparatus and device based on a cascaded convolutional neural network.
The above object of the present application is achieved by the following technical solutions:
in a first aspect, an embodiment of the present application provides a pin defect identification method based on a cascaded convolutional neural network, including:
improving an algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
and utilizing the training model to perform pin defect recognition on the image to be recognized: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
Optionally, the improved convolutional layer structure includes:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers at the same time; wherein each feature map in the nonlinear multilayer perceptron is calculated as:

f^n_{i,j,k} = max( (w^n_k)^T x_{i,j} + b^n_k , 0 )

wherein n represents the n-th layer, w^n_k and b^n_k represent the weight vector and the offset (bias) of the k-th feature map in the n-th layer, (i, j) represents the position index of an image pixel, x_{i,j} represents the input image patch centered at position (i, j), and k represents the index of the feature map to be extracted.
Optionally, the fusing the multi-scale feature map includes:
performing three-layer convolution on a 12 × 12 pixel detection window in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
Optionally, the improved loss function includes:
rewriting the loss as a function of an angle variable θ and introducing an integer N to enlarge the angle, the improved loss function being:
[equation not reproduced here: the classification cross-entropy loss expressed in terms of the angle θ, with the target-class angle multiplied by the integer N]
where L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
Optionally, the improved training strategy includes:
dividing the training process into two steps of pre-training and off-line difficult sample training;
before training, dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection-over-union (IOU);
in the pre-training stage, a strategy of on-line difficult sample mining is adopted, namely in the training process, the propagation losses generated by calculating each batch of data are sequenced, and the samples with the largest propagation losses are divided into difficult samples according to a certain proportion; updating weights in the neural network model only with the loss of difficult samples when backward propagation is performed;
in an off-line difficult sample training stage, aiming at network models of input images with different scales, scaling the obtained positive samples, negative samples and partial positive samples into 12 × 12 pixels and 24 × 24 pixels, and respectively training PNet and improved RNet; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
Optionally, dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection-over-union ratio includes:
selecting a training sample by a sliding window method, namely performing pyramid processing on a training image, and performing region selection on the image by using a sliding window with a set size;
and calculating the selected area and the IOU of the labeling box, and recording the area with the IOU larger than 0.7 as a positive sample, recording the area with the IOU smaller than 0.3 as a negative sample, and recording the area between 0.5 and 0.7 as a partial positive sample.
Optionally, the detection process of performing pin defect recognition on the image to be recognized by using the training model is cascade detection, which includes:
carrying out pyramid processing on the input image, and detecting each level of processed image by means of PNet to obtain a preliminary candidate frame;
mapping the candidate frame obtained from each level of image to an original image to obtain a target slice, and classifying and performing boundary regression on the primary candidate frame by means of the RNet to obtain a secondary candidate frame;
and classifying and performing boundary regression on the secondary candidate frames reaching the set threshold value again by using ONet to obtain a detection result.
Optionally, in the cascade detection process, the redundancy of each level of candidate frames is reduced by using a non-maximum suppression algorithm NMS.
In a second aspect, an embodiment of the present application further provides a pin defect identification apparatus based on a cascaded convolutional neural network, including:
an improvement module for improving the algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
a construction module for constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
the training module is used for training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
the identification module is used for utilizing the training model to identify the pin defects of the image to be identified: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
In a third aspect, an embodiment of the present application further provides a pin defect identification device based on a cascaded convolutional neural network, including:
the processor is connected with the memory;
the memory for storing a program for implementing at least the method according to any of the first aspects;
the processor is used for calling and executing the program stored in the memory.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the technical solution provided by the embodiments of the application, first, when processing large-scale images, a small-scale shallow fully convolutional network traverses the aerial image quickly to search for targets, and deeper convolutional networks then perform cascaded classification and precise localization of the candidate targets produced by the preceding single-scale shallow fully convolutional network. In addition, the shallow fully convolutional network can accept aerial images of any scale and detects quickly, and a cascade detection mechanism composed of several convolutional neural networks has a clear precision advantage over a single neural network. On this basis, the convolutional layer structure is improved, multi-scale feature maps are fused, and an angle variable is added to the classification cross-entropy loss function; in the training stage, multi-task learning and an offline difficult sample mining strategy are used, which effectively improves the trained model. As a result, the identification speed and accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flowchart of a pin defect identification method based on a cascaded convolutional neural network according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a model training process provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a pin defect identification apparatus based on a cascaded convolutional neural network according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a pin defect identification device based on a cascaded convolutional neural network according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
Examples
Referring to fig. 1, fig. 1 is a schematic flowchart of a pin defect identification method based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 1, the method comprises at least the following steps:
s101: improving an algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
specifically, the MTCNN (Multi-task Cascaded Convolutional Network) algorithm uses three cascaded networks and the candidate-box-plus-classifier idea for detection. The candidate network PNet is a shallow fully convolutional network that quickly generates candidate windows, the optimized network RNet is a deeper network that filters the candidate windows with higher precision, and the output network ONet is a still deeper network that produces the final bounding boxes.
In some embodiments, the improved convolutional layer structure comprises:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers at the same time; wherein each feature map in the nonlinear multilayer perceptron is calculated as:

f^n_{i,j,k} = max( (w^n_k)^T x_{i,j} + b^n_k , 0 )

wherein n represents the n-th layer, w^n_k and b^n_k represent the weight vector and the offset (bias) of the k-th feature map in the n-th layer, (i, j) represents the position index of an image pixel, x_{i,j} represents the input image patch centered at position (i, j), and k represents the index of the feature map to be extracted.
Specifically, because traditional convolution has limited ability to extract highly nonlinear features, a nonlinear multilayer perceptron is used to perform more complex operations on the neurons of each local receptive field. This not only yields features with stronger generalization ability and a higher level of abstraction, but also reduces the dimensionality of the data.
Modifying the network structure in this way (adding the nonlinear multilayer perceptron) improves target detection accuracy but increases the complexity of the network model, which generally reduces the detection speed of the model. To alleviate this problem, the convolution kernels in the original network are decomposed, which significantly reduces the amount of computation and, to some extent, the size of the model.
To further improve the expressive power of the network, the first-layer 3 × 3 convolution operations of PNet, RNet and ONet in the MTCNN cascaded convolutional neural network aggregate features over multiple scales before convolving them again. In addition, the fully connected layers are removed, i.e., the fully connected layer of a conventional CNN is replaced with global pooling, which reduces the number of network parameters and the possibility of overfitting.
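As an illustration of these structural changes, the following PyTorch block combines a factorized convolution, stacked 1 × 1 convolutions acting as the nonlinear multilayer perceptron, and global average pooling in place of a fully connected layer. This is a sketch under assumptions: the channel counts, the decomposition into 3 × 1 and 1 × 3 kernels, and the use of ReLU are illustrative choices, not the patent's exact layer configuration.

```python
import torch
import torch.nn as nn

class ImprovedConvBlock(nn.Module):
    """Illustrative block: factorized 3x3 convolution, 1x1 'mlpconv' layers
    (the nonlinear multilayer perceptron) and global average pooling."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.features = nn.Sequential(
            # 3x3 kernel decomposed into a 3x1 and a 1x3 convolution
            nn.Conv2d(in_ch, out_ch, kernel_size=(3, 1), padding=(1, 0)),
            nn.Conv2d(out_ch, out_ch, kernel_size=(1, 3), padding=(0, 1)),
            nn.ReLU(inplace=True),
            # nonlinear multilayer perceptron realized as stacked 1x1 convolutions
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        # global average pooling replaces the fully connected layer
        self.gap = nn.AdaptiveAvgPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.gap(self.features(x)).flatten(1)   # (batch, out_ch)

if __name__ == "__main__":
    block = ImprovedConvBlock(3, 16)
    print(block(torch.randn(2, 3, 48, 48)).shape)      # torch.Size([2, 16])
```

Stacking 1 × 1 convolutions in this way follows the network-in-network idea: each local receptive field is processed by a small perceptron instead of a single linear filter.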
In some embodiments, the fusing the multi-scale feature maps includes:
performing three-layer convolution on a 12 × 12 pixel detection window in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
Specifically, multi-scale feature map fusion follows the idea of combining the detection results of different layers to improve detection performance: predictions are made separately from the multi-scale features, and the prediction results are then fused. To improve the performance of pin defect identification in terms of resolution, position information and the like, this feature map fusion is added to the MTCNN cascaded convolutional neural network.
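A minimal sketch of this kind of fusion, assuming PyTorch; resizing the shallow map with bilinear interpolation and concatenating along the channel axis are illustrative assumptions, since the patent does not state the exact fusion operator:

```python
import torch
import torch.nn.functional as F

def fuse_feature_maps(shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
    """Resize the shallow (e.g. first-layer) feature map to the spatial size of the
    deep (e.g. third-layer) map and concatenate the two along the channel axis."""
    shallow_resized = F.interpolate(
        shallow, size=deep.shape[-2:], mode="bilinear", align_corners=False
    )
    return torch.cat([shallow_resized, deep], dim=1)

if __name__ == "__main__":
    first_layer = torch.randn(1, 10, 24, 24)   # illustrative shallow features
    third_layer = torch.randn(1, 32, 12, 12)   # illustrative deep features
    print(fuse_feature_maps(first_layer, third_layer).shape)  # torch.Size([1, 42, 12, 12])
```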
In some embodiments, the improved loss function comprises:
rewriting the loss as a function of an angle variable θ and introducing an integer N to enlarge the angle, the improved loss function being:
[equation not reproduced here: the classification cross-entropy loss expressed in terms of the angle θ, with the target-class angle multiplied by the integer N]
where L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
Specifically, in the MTCNN model the classification task uses a cross-entropy loss function. Because this function uses only a hyperplane as the classification boundary, a too-small inter-class distance leads to poor classification of samples near the boundary; in other words, a good loss function should have a sufficiently small intra-class distance and a sufficiently large inter-class distance. The loss function is improved on this basis. After the improvement, the inter-class distance is effectively increased, so that the decision regions are better separated and the intra-class angular distribution is compressed.
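The exact formula is not reproduced in this text; the description (cross-entropy rewritten through an angle θ, with an integer N enlarging the target-class angle) matches the A-Softmax / L-Softmax family of losses. The sketch below implements that reading in PyTorch as an assumption, and it omits the piecewise monotonic correction used in the published A-Softmax loss:

```python
import torch
import torch.nn.functional as F

def angular_margin_loss(features: torch.Tensor,
                        weight: torch.Tensor,
                        labels: torch.Tensor,
                        n_margin: int = 4) -> torch.Tensor:
    """Cross-entropy expressed through the angle between a feature and each class
    weight vector; the target-class angle is multiplied by the integer n_margin."""
    eps = 1e-7
    w_unit = F.normalize(weight, dim=1)                     # (C, D) unit class weights
    x_norm = features.norm(dim=1, keepdim=True)             # (B, 1) feature magnitudes
    cos_theta = F.normalize(features, dim=1) @ w_unit.t()   # (B, C) cos(theta)
    theta = torch.acos(cos_theta.clamp(-1 + eps, 1 - eps))
    cos_enlarged = torch.cos(n_margin * theta)              # cos(N * theta)
    target = F.one_hot(labels, num_classes=weight.size(0)).bool()
    logits = torch.where(target, cos_enlarged, cos_theta) * x_norm
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    feats = torch.randn(8, 128)
    W = torch.randn(2, 128, requires_grad=True)             # 2 classes: normal / defective pin
    y = torch.randint(0, 2, (8,))
    print(angular_margin_loss(feats, W, y).item())
```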
In some embodiments, the improved training strategy comprises:
the training process is divided into two steps of pre-training and off-line difficult sample training, so that the generalization performance of the model is enhanced to the maximum extent under the condition of limited labeling data. The specific training process will be explained in the following steps.
S102: constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
among other things, the purpose of the preprocessing is to facilitate subsequent training. Specifically, the original images are augmented with illumination, mirroring, color and environment transformations to obtain a large number of (for example, 6000) aerial images, which are then annotated with a labeling tool; further, to expand the quantity and diversity of the training data, the training samples are divided into 4 categories: positive samples, negative samples, partial positive samples, and difficult samples.
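A minimal augmentation sketch in NumPy, covering mirroring plus global illumination and per-channel color scaling; the parameter ranges are illustrative assumptions, not values from the patent:

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator) -> list:
    """Return augmented copies of an aerial image (H, W, 3, uint8):
    mirroring, global illumination scaling and per-channel color jitter."""
    copies = [image, np.fliplr(image)]                               # original + mirror
    gain = rng.uniform(0.6, 1.4)                                     # illumination change
    copies.append(np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8))
    jitter = rng.uniform(0.8, 1.2, size=(1, 1, 3))                   # color transformation
    copies.append(np.clip(image.astype(np.float32) * jitter, 0, 255).astype(np.uint8))
    return copies

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
    print(len(augment(img, rng)), "images from one original")
```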
With respect to sample types, before training the image samples are divided into positive samples, negative samples and partial positive samples according to the intersection-over-union (IOU). The IOU is a measure of localization accuracy in object detection tasks, defined as the ratio of the intersection to the union of the areas of the predicted box and the labeled box:

IOU = Area(B_p ∩ B_gt) / Area(B_p ∪ B_gt)

where B_p is the predicted box and B_gt is the labeled (ground-truth) box.
specifically, one possible partitioning method is: selecting a training sample by a sliding window method, namely performing pyramid processing on a training image, and performing region selection on the image by using a sliding window with a set size; and calculating the selected area and the IOU of the labeling box, and recording the area with the IOU larger than 0.7 as a positive sample, recording the area with the IOU smaller than 0.3 as a negative sample, and recording the area between 0.5 and 0.7 as a partial positive sample.
In addition, to keep the classes of normal pin targets and defective pin targets balanced in the training samples, 70% of the aerial images are used as the training set and the remaining 30% as the test set, which is used to verify the generalization and practicality of the model.
S103: training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model; the cascade convolutional neural network comprises a three-layer network structure of a candidate network PNet, an optimized network RNet and an output network ONet;
in order to enhance the generalization performance of the model to the maximum extent under the condition of limited labeling data, the training process is divided into two steps, namely pre-training and off-line difficult sample training.
The training process is shown in fig. 2. Specifically, in the pre-training stage an online hard example mining (OHEM) strategy is adopted: during training, the propagation losses computed for each batch of data are sorted, and the samples with the largest propagation losses are designated as difficult samples according to a certain proportion; during backward propagation, only the losses of the difficult samples are used to update the weights of the neural network model. The pre-trained MTCNN model is then used to detect the training images, which yields a large number of falsely detected and missed targets; these targets are the difficult samples. The OHEM strategy ignores the gradients of easily classified samples during back-propagation; because most selected samples would otherwise be simple samples, a large number of simple samples prevents the model from distinguishing difficult samples efficiently, which is the main cause of false and missed detections, and adding difficult samples to the training data therefore reduces the false detection rate and the missed detection rate of the results.
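A minimal sketch of the online hard example mining step in PyTorch: the per-sample losses of a batch are sorted and only the hardest fraction contributes to back-propagation. The 70% ratio is an illustrative assumption; the text only says "a certain proportion":

```python
import torch
import torch.nn.functional as F

def ohem_classification_loss(logits: torch.Tensor,
                             labels: torch.Tensor,
                             hard_ratio: float = 0.7) -> torch.Tensor:
    """Online hard example mining: compute the per-sample loss of a batch, keep only
    the hardest fraction (largest losses) and back-propagate those alone."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")   # (B,)
    n_hard = max(1, int(hard_ratio * per_sample.numel()))
    hard_losses, _ = torch.topk(per_sample, k=n_hard)                # largest losses
    return hard_losses.mean()

if __name__ == "__main__":
    logits = torch.randn(32, 2, requires_grad=True)   # 2 classes: normal / defective pin
    labels = torch.randint(0, 2, (32,))
    loss = ohem_classification_loss(logits, labels)
    loss.backward()    # only the selected hard samples contribute gradients
    print(loss.item())
```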
In addition, in an offline difficult sample training stage, aiming at network models of input images with different scales, scaling the obtained positive samples, negative samples and partial positive samples into 12 × 12 pixels and 24 × 24 pixels, and respectively training the PNet and the improved RNet; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
S104: and utilizing the training model to perform pin defect recognition on the image to be recognized: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
Specifically, the detection process of performing pin defect recognition on the image to be recognized by using the training model is cascade detection, which includes:
carrying out pyramid processing on the input image, and detecting each level of the processed image with PNet to obtain preliminary candidate frames; PNet works like a GPU-accelerated sliding-window method: while sliding over each level of the image, it estimates the probability that a window belongs to a pin target, thereby obtaining a probability map for that level of the image;
mapping the candidate frame obtained from each level of image to an original image to obtain a target slice, and classifying and performing boundary regression on the primary candidate frame by means of the RNet to obtain a secondary candidate frame;
and classifying and performing boundary regression on the secondary candidate frames reaching the set threshold value again by using ONet to obtain a detection result.
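The control flow of this cascade can be sketched as follows; the pyramid scales, score thresholds, crop sizes and the stand-in pnet/rnet/onet callables are all assumptions used only to make the flow runnable, not the patent's actual networks:

```python
import numpy as np

def crop_and_resize(image, box, size):
    """Nearest-neighbour crop of an (H, W, 3) image to a (size, size, 3) slice."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    patch = image[y1:y2, x1:x2]
    ys = np.linspace(0, patch.shape[0] - 1, size).astype(int)
    xs = np.linspace(0, patch.shape[1] - 1, size).astype(int)
    return patch[np.ix_(ys, xs)]

def cascade_detect(image, pnet, rnet, onet, scales=(1.0, 0.7, 0.5),
                   r_thr=0.6, o_thr=0.7):
    """Pyramid -> PNet candidates -> RNet filtering -> ONet final boxes.
    pnet is assumed to handle its pyramid level internally and return boxes
    already mapped back to original-image coordinates."""
    candidates = []
    for s in scales:
        candidates.extend(pnet(image, s))
    stage2 = [b for b in candidates if rnet(crop_and_resize(image, b, 24)) > r_thr]
    return [b for b in stage2 if onet(crop_and_resize(image, b, 48)) > o_thr]

if __name__ == "__main__":
    img = np.zeros((300, 400, 3), dtype=np.uint8)
    # stand-in networks: PNet proposes fixed boxes, RNet/ONet return constant scores;
    # duplicate candidates across pyramid levels would normally be merged by NMS
    pnet = lambda im, s: [(10, 10, 60, 60), (100, 120, 180, 200)]
    rnet = lambda crop: 0.9
    onet = lambda crop: 0.8
    print(cascade_detect(img, pnet, rnet, onet))
```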
In addition, a large number of overlapping frames are generated when candidate frames are produced during cascade detection, so in some embodiments the non-maximum suppression algorithm (NMS) is used to reduce the redundancy of the candidate frames at each level.
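NMS itself is standard; a plain NumPy version is sketched below, with the 0.5 IOU suppression threshold chosen for illustration (the patent does not state a value):

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Standard non-maximum suppression: keep the highest-scoring box, drop any
    remaining box whose IOU with it exceeds the threshold, and repeat."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter + 1e-12)
        order = order[1:][iou <= iou_threshold]
    return keep

if __name__ == "__main__":
    boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]], float)
    scores = np.array([0.9, 0.8, 0.7])
    print(nms(boxes, scores))   # [0, 2]: the second box overlaps the first and is suppressed
```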
By the method, accurate detection results can be obtained.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the technical solution provided by the embodiments of the application, first, when processing large-scale images, a small-scale shallow fully convolutional network traverses the aerial image quickly to search for targets, and deeper convolutional networks then perform cascaded classification and precise localization of the candidate targets produced by the preceding single-scale shallow fully convolutional network. In addition, the shallow fully convolutional network can accept aerial images of any scale and detects quickly, and a cascade detection mechanism composed of several convolutional neural networks has a clear precision advantage over a single neural network. On this basis, the convolutional layer structure is improved, multi-scale feature maps are fused, and an angle variable is added to the classification cross-entropy loss function; in the training stage, multi-task learning and an offline difficult sample mining strategy are used, which effectively improves the trained model. As a result, the identification speed and accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.
In addition, corresponding to the pin defect identification method based on the cascaded convolutional neural network provided by the embodiment, the embodiment of the application also provides a pin defect identification device based on the cascaded convolutional neural network. The device is a functional module based on software, hardware or a combination thereof in equipment for executing the method.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a pin defect identification apparatus based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 3, the apparatus includes at least:
an improvement module 21 for improving the algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
a construction module 22 for constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
the training module 23 is configured to train the cascaded convolutional neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
the identification module 24 is configured to perform pin defect identification on the image to be identified by using the training model: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
For a specific implementation method of the steps executed by each module in the apparatus, reference may be made to the foregoing method embodiment, which is not described herein again.
In addition, corresponding to the pin defect identification method based on the cascaded convolutional neural network provided by the embodiment, the embodiment of the application also provides pin defect identification equipment based on the cascaded convolutional neural network. The device is an intelligent device, such as a PC or a mobile intelligent device, which applies the method described above.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a pin defect identification device based on a cascaded convolutional neural network according to an embodiment of the present application. As shown in fig. 4, the apparatus includes at least:
a memory 31 and a processor 32 connected to the memory 31;
the memory 31 is used for storing a program, and the program is at least used for implementing the pin defect identification method based on the cascaded convolutional neural network described in the above method embodiment;
the processor 32 is used to call and execute the program stored in the memory 31.
For specific implementation methods of each step of the method implemented by the program, reference may be made to the foregoing method embodiments, and details are not described here again.
With this solution, the recognition speed and recognition accuracy are greatly improved, and the model is easier to port to and apply on mobile devices.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A pin defect identification method based on a cascade convolution neural network is characterized by comprising the following steps:
improving an algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
and utilizing the training model to perform pin defect recognition on the image to be recognized: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
2. The method of claim 1, wherein the modified convolutional layer structure comprises:
adding a nonlinear multilayer perceptron after some of the convolutional layers, decomposing the convolution kernels of the original convolutions, and removing the fully connected layers at the same time; wherein each feature map in the nonlinear multilayer perceptron is calculated as:

f^n_{i,j,k} = max( (w^n_k)^T x_{i,j} + b^n_k , 0 )

wherein n represents the n-th layer, w^n_k and b^n_k represent the weight vector and the offset (bias) of the k-th feature map in the n-th layer, (i, j) represents the position index of an image pixel, x_{i,j} represents the input image patch centered at position (i, j), and k represents the index of the feature map to be extracted.
3. The method of claim 1, wherein the fusing the multi-scale feature maps comprises:
performing three-layer convolution on a 12 × 12 pixel detection window in the PNet, then performing loss calculation, and performing feature map fusion on the first layer and the third layer of the PNet;
feature map fusion is performed on the second and third layers of RNet and ONet based on the same principle.
4. The method of claim 1, wherein the improved loss function comprises:
rewriting the loss as a function of an angle variable θ and introducing an integer N to enlarge the angle, the improved loss function being:
[equation not reproduced here: the classification cross-entropy loss expressed in terms of the angle θ, with the target-class angle multiplied by the integer N]
wherein L denotes the loss function, θ is the angle between the classification plane and W, and W is the weight vector of the neuron.
5. The method of claim 1, wherein the improved training strategy comprises:
dividing the training process into two steps of pre-training and off-line difficult sample training;
before training, dividing an image sample into a positive sample, a negative sample and a part of positive samples by means of the size of an intersection ratio IOU;
in the pre-training stage, a strategy of on-line difficult sample mining is adopted, namely in the training process, the propagation losses generated by calculating each batch of data are sequenced, and the samples with the largest propagation losses are divided into difficult samples according to a certain proportion; updating weights in the neural network model only with the loss of difficult samples when backward propagation is performed;
in an off-line difficult sample training stage, aiming at network models of input images with different scales, scaling the obtained positive samples, negative samples and partial positive samples into 12 × 12 pixels and 24 × 24 pixels, and respectively training PNet and improved RNet; finally, the acquired difficult samples, positive samples and partial positive samples are scaled to 48 × 48 pixels to retrain the improved ONet.
6. The method of claim 5, wherein dividing the image samples into positive samples, negative samples and partial positive samples according to the intersection-over-union ratio comprises:
selecting a training sample by a sliding window method, namely performing pyramid processing on a training image, and performing region selection on the image by using a sliding window with a set size;
and calculating the selected area and the IOU of the labeling box, and recording the area with the IOU larger than 0.7 as a positive sample, recording the area with the IOU smaller than 0.3 as a negative sample, and recording the area between 0.5 and 0.7 as a partial positive sample.
7. The method according to claim 1, wherein the detection process of pin defect recognition on the image to be recognized by using the training model is cascade detection, which comprises:
carrying out pyramid processing on the input image, and detecting each level of processed image by means of PNet to obtain a preliminary candidate frame;
mapping the candidate frame obtained from each level of image to an original image to obtain a target slice, and classifying and performing boundary regression on the primary candidate frame by means of the RNet to obtain a secondary candidate frame;
and classifying and performing boundary regression on the secondary candidate frames reaching the set threshold value again by using ONet to obtain a detection result.
8. The method according to claim 7, characterized in that during the cascade detection, a non-maximum suppression algorithm NMS is used to reduce the redundancy of the candidate boxes at each level.
9. A pin defect identification device based on a cascade convolution neural network is characterized by comprising:
an improvement module for improving the algorithm model: improving based on an original MTCNN algorithm, and constructing an improved MTCNN algorithm model; wherein the improvement comprises: improving the convolutional layer structure, fusing the multi-scale characteristic diagram, improving the loss function and improving the training strategy; the MTCNN algorithm comprises a candidate network PNet, an optimized network RNet and an output network ONet three-layer network structure;
a construction module for constructing a training data set: the method comprises the steps that an electric power fitting image sample at a connection position in a power transmission line is obtained through shooting by an unmanned aerial vehicle in different environments, preprocessing is carried out on the electric power fitting image sample, and a sample training set is constructed;
the training module is used for training the cascade convolution neural network by using the sample training set based on the improved MTCNN algorithm model to obtain a training model;
the identification module is used for utilizing the training model to identify the pin defects of the image to be identified: and inputting the electric power fitting image acquired by the unmanned aerial vehicle inspection into the training model to obtain a pin state recognition result.
10. A pin defect identification device based on a cascaded convolutional neural network, comprising:
the processor is connected with the memory;
the memory for storing a program for implementing at least the method of any one of claims 1-8;
the processor is used for calling and executing the program stored in the memory.
CN202110109172.2A 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network Active CN112837281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110109172.2A CN112837281B (en) 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110109172.2A CN112837281B (en) 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network

Publications (2)

Publication Number Publication Date
CN112837281A (en) 2021-05-25
CN112837281B (en) 2022-10-28

Family

ID=75932057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110109172.2A Active CN112837281B (en) 2021-01-27 2021-01-27 Pin defect identification method, device and equipment based on cascade convolution neural network

Country Status (1)

Country Link
CN (1) CN112837281B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469951A (en) * 2021-06-08 2021-10-01 燕山大学 Hub defect detection method based on cascade region convolutional neural network
CN113538387A (en) * 2021-07-23 2021-10-22 广东电网有限责任公司 Multi-scale inspection image identification method and device based on deep convolutional neural network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148079A1 (en) * 2014-11-21 2016-05-26 Adobe Systems Incorporated Object detection using cascaded convolutional neural networks
CN107748858A (en) * 2017-06-15 2018-03-02 华南理工大学 A kind of multi-pose eye locating method based on concatenated convolutional neutral net
CN109145854A (en) * 2018-08-31 2019-01-04 东南大学 A kind of method for detecting human face based on concatenated convolutional neural network structure
US20200226421A1 (en) * 2019-01-15 2020-07-16 Naver Corporation Training and using a convolutional neural network for person re-identification
CN110210354A (en) * 2019-05-23 2019-09-06 南京邮电大学 A kind of detection of haze weather traffic mark with know method for distinguishing
CN111650204A (en) * 2020-05-11 2020-09-11 安徽继远软件有限公司 Transmission line hardware defect detection method and system based on cascade target detection

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469951A (en) * 2021-06-08 2021-10-01 燕山大学 Hub defect detection method based on cascade region convolutional neural network
CN113538387A (en) * 2021-07-23 2021-10-22 广东电网有限责任公司 Multi-scale inspection image identification method and device based on deep convolutional neural network
CN113538387B (en) * 2021-07-23 2024-04-05 广东电网有限责任公司 Multi-scale inspection image identification method and device based on deep convolutional neural network

Also Published As

Publication number Publication date
CN112837281B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
CN110503112B (en) Small target detection and identification method for enhancing feature learning
CN110827251B (en) Power transmission line locking pin defect detection method based on aerial image
CN111767882B (en) Multi-mode pedestrian detection method based on improved YOLO model
CN114627360B (en) Substation equipment defect identification method based on cascade detection model
CN114240878A (en) Routing inspection scene-oriented insulator defect detection neural network construction and optimization method
CN111967480A (en) Multi-scale self-attention target detection method based on weight sharing
CN114758288B (en) Power distribution network engineering safety control detection method and device
CN114972213A (en) Two-stage mainboard image defect detection and positioning method based on machine vision
CN111582092B (en) Pedestrian abnormal behavior detection method based on human skeleton
CN112434723B (en) Day/night image classification and object detection method based on attention network
CN113052834A (en) Pipeline defect detection method based on convolution neural network multi-scale features
CN112837281B (en) Pin defect identification method, device and equipment based on cascade convolution neural network
CN114463759A (en) Lightweight character detection method and device based on anchor-frame-free algorithm
CN111223087B (en) Automatic bridge crack detection method based on generation countermeasure network
CN116385958A (en) Edge intelligent detection method for power grid inspection and monitoring
CN113096085A (en) Container surface damage detection method based on two-stage convolutional neural network
CN116152658A (en) Forest fire smoke detection method based on domain countermeasure feature fusion network
CN117934375A (en) Lightweight lithium battery surface defect detection method for enhancing image feature fusion
CN115240259A (en) Face detection method and face detection system based on YOLO deep network in classroom environment
CN116895030A (en) Insulator detection method based on target detection algorithm and attention mechanism
CN113887455B (en) Face mask detection system and method based on improved FCOS
CN113012107B (en) Power grid defect detection method and system
Cao et al. A spatial pyramid pooling convolutional neural network for smoky vehicle detection
CN117113066A (en) Transmission line insulator defect detection method based on computer vision
CN117392568A (en) Method for unmanned aerial vehicle inspection of power transformation equipment in complex scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant