CN108647655B - Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network - Google Patents

Publication number
CN108647655B
Authority
CN
China
Prior art keywords
power line
prediction
value
frame
boundary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810465955.2A
Other languages
Chinese (zh)
Other versions
CN108647655A (en)
Inventor
张菁
王立元
卓力
梁西
李昱钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201810465955.2A
Publication of CN108647655A
Application granted
Publication of CN108647655B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A method for detecting foreign matter on power lines in low-altitude aerial images based on a lightweight convolutional neural network. The method belongs to the field of computer vision and addresses real-time detection of power line foreign matter in aerial images captured by unmanned aerial vehicles (UAVs). First, a lightweight power line detection model is built with a convolutional neural network, and the deep features of the power lines in the aerial images are computed. Next, a multi-target power line foreign matter detection model is built with a convolutional neural network; it uses convolutional layers of different heights and widths and computes predictions for multi-scale targets from the deep features. Finally, the power line detection model filters out video frames that contain no power line, and the multi-target foreign matter detection model is applied to the frames in which power lines were detected, achieving real-time detection of power line foreign matter in low-altitude aerial images.

Description

Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network
Technical Field
The invention discloses a deep-learning-based method for real-time detection of power line foreign matter in aerial images captured by unmanned aerial vehicles (UAVs). First, a lightweight power line detection model is built with a convolutional neural network, and the deep features of the power lines in the aerial images are computed. Next, a multi-target power line foreign matter detection model is built with a convolutional neural network; it uses convolutional layers of different heights and widths and computes predictions for multi-scale targets from the deep features. Finally, the power line detection model filters out video frames that contain no power line, and the multi-target foreign matter detection model is applied to the frames in which power lines were detected, achieving real-time detection of power line foreign matter in low-altitude aerial images. The invention belongs to the field of computer vision and relates in particular to deep learning and target detection.
Background
With the development of information technology, high-performance aerial sensors have come into wide use in aerial photography. UAV technology is also maturing, which has greatly advanced low-altitude aerial photography and made it a new, widely applicable technology with broad prospects. Low-altitude aerial image data is growing massively and is characterized by multiple viewing angles and complex backgrounds, so processing it efficiently and in real time has important research significance and application value. Such data has important applications in natural disaster assessment, transportation, urban planning and other areas. Because UAVs are efficient and safe, power line inspection in power systems has also become one of the important application fields.
Power lines are important national infrastructure and carry the responsibility of power transmission, and power line inspection is an important guarantee of the stable operation of the power system. With the rapid development of China's power system and the maturing of technologies such as long-distance high-voltage lines and ultra-high-voltage transmission, power transmission now features large capacity and long distances, and more and more power lines extend from cities into mountains, fields and other complex geographic environments. Traditional manual inspection is time-consuming and labor-intensive, is affected by the natural environment and climate, and constrains the construction of China's power system. UAV power line inspection is efficient and safe and is not affected by weather or terrain, and it has become an important mode of power line inspection. During UAV inspection, a digital camera carried by the UAV captures low-altitude power line images, which contain the basic condition of the power lines. By processing these images, abnormal states of the power lines can be found in time and dealt with quickly.
Image processing technology includes image compression, segmentation, enhancement, description, recognition and so on, and target recognition is one of its important applications. Traditional target recognition relies on hand-crafted features and has difficulty handling diverse targets against complex backgrounds and in massive data. In recent years, deep learning, one of the latest technologies in artificial intelligence, has shown excellent performance on target detection; deep detection frameworks such as SSD (Single Shot MultiBox Detector) achieve high recognition accuracy while further improving efficiency.
To address the growing volume of power line inspection images and the limited effectiveness of traditional target recognition methods, the invention provides a method for detecting power line foreign matter in aerial images based on a multi-scale convolutional neural network. First, a lightweight power line detection model based on a convolutional neural network (CNN) is constructed, and multi-level deep features of power line images are learned on a pre-training data set. Next, a power line foreign matter detection model based on a convolutional neural network is constructed, in which convolutional layers of different heights and widths handle targets of different scales to obtain multi-scale target predictions. The power line detection model then filters out irrelevant images that contain no power line, and the multi-scale target predictions are combined. Finally, a non-maximum suppression algorithm is applied to the multi-scale predictions to retain the boxes with high confidence, achieving detection of abnormal targets on the power line.
Disclosure of Invention
Using deep learning, the invention provides a real-time power line foreign matter detection method based on a lightweight convolutional neural network, which differs from existing power line foreign matter detection methods. First, a lightweight power line detection model is constructed with a convolutional neural network (CNN). Because the power line is a single target, a lightweight model with few layers meets the needs of single-target detection and effectively reduces training and detection time. The network is pre-trained on a self-built power line image data set to extract the deep features of the power line. Second, a power line foreign matter detection model is trained with a convolutional neural network; convolutional layers of different heights and widths are added to the model, and predictions are computed on several layers simultaneously. The outputs of the different layers are then combined so that the deep features of multi-scale targets are learned. The model is pre-trained on a self-built power line foreign matter data set, and data augmentation (random flipping, cropping and color changes) is used to enlarge the data volume and further improve generalization.
Finally, in the detection stage, the power line detection model first removes irrelevant frames from the video and retains the key frames that contain power lines, and the predicted power line bounding boxes in the key frames are extracted. The power line foreign matter detection model then processes the key frames to obtain predictions for all targets; a non-maximum suppression algorithm filters out highly similar bounding boxes and retains those with higher confidence. The resulting power line bounding boxes and foreign matter bounding boxes are then used to detect foreign matter in aerial power line images quickly and accurately. The main flow of the method is shown in FIG. 1 and can be divided into three steps: construction of the target detection models based on convolutional neural networks, neural network pre-training, and detection of abnormal targets on the power line.
(1) Power line detection model construction based on light convolutional neural network
The invention targets aerial images. To remove irrelevant frames from the video effectively, a power line detection model based on a lightweight convolutional neural network is constructed first. The network has a simple structure and few layers, so it extracts power line features effectively while keeping detection real-time. For the two categories of power line foreign matter considered, kites and balloons, a power line foreign matter detection model based on a lightweight convolutional neural network is constructed. The two models detect in stages, which improves detection accuracy while also improving real-time performance.
(2) Neural network pre-training
The power line detection model is pre-trained on the Powerline Image Dataset. For the power line foreign matter detection model, self-collected balloon and kite pictures serve as source data; a data augmentation algorithm applies translation, cropping and color changes to expand them to 4000 images, which form the training data set. The data set contains power line and power line foreign matter images at different scales, lighting conditions and shooting angles, so deep features under different conditions can be learned effectively.
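The augmentation idea described above can be illustrated with a minimal pure-Python sketch. The function names, the nested-list image representation, the 90% crop size and the 0.8 to 1.2 color range are all assumptions made for illustration; the patent does not specify them, and a real pipeline would also have to transform the bounding-box labels together with the pixels.

```python
import random

# An image is modeled as rows of (r, g, b) pixels with values in 0..255 (assumed layout).

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def scale_color(image, factor):
    """Multiply every channel by factor and clip to 255."""
    return [[tuple(min(255, int(round(c * factor))) for c in px) for px in row]
            for row in image]


def augment(image, rng):
    """Apply a random flip, a random 90% crop and a random color scaling."""
    out = flip_horizontal(image) if rng.random() < 0.5 else image
    h, w = len(out), len(out[0])
    out = random_crop(out, max(1, h * 9 // 10), max(1, w * 9 // 10), rng)
    return scale_color(out, rng.uniform(0.8, 1.2))
```

Calling `augment` repeatedly with a seeded `random.Random` yields different variants of each source picture, which is how a small collection can be expanded to a larger training set.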
(3) Power line anomaly target detection
The invention provides a multi-stage power line target detection method. First, the power line detection model examines the aerial video frame by frame, and irrelevant frames without a power line target are discarded. Key frames that contain a power line target are then examined further with the power line foreign matter detection model; the predicted power line parameters and the predicted foreign matter bounding boxes are computed to judge whether a foreign matter target is present on the power line.
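The cascade described above can be sketched as follows. This is a minimal illustration, not the patented implementation: `inspect_video` and the two detector callables are hypothetical names, and each detector is assumed to return a (possibly empty) list of boxes for a frame.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# (x_min, y_min, x_max, y_max, confidence); an assumed box layout.
Box = Tuple[float, float, float, float, float]


def inspect_video(
    frames: Iterable[object],
    detect_power_lines: Callable[[object], Sequence[Box]],
    detect_foreign_objects: Callable[[object], Sequence[Box]],
) -> List[Tuple[int, Sequence[Box], Sequence[Box]]]:
    """Run both detectors in cascade and return results for key frames only."""
    results = []
    for index, frame in enumerate(frames):
        line_boxes = detect_power_lines(frame)
        if not line_boxes:
            # Irrelevant frame: no power line target, so the second model is skipped.
            continue
        foreign_boxes = detect_foreign_objects(frame)
        results.append((index, line_boxes, foreign_boxes))
    return results
```

Because the second model only runs on frames that pass the first, the cost of the heavier foreign matter detector is paid only for key frames.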
Compared with the prior art, the invention has the following obvious advantages and beneficial effects:
First, compared with traditional power line target recognition methods based on hand-crafted features, the invention uses convolutional neural networks to construct the lightweight power line detection model and the power line foreign matter detection model. Filtering irrelevant frames out of the power line imagery greatly improves detection efficiency, and the lightweight network keeps foreign matter detection real-time. Experiments show that this structure filters irrelevant frames out of UAV aerial images effectively. In addition, multi-scale convolutional layers are added to the lightweight power line foreign matter detection model to learn foreign matter image features at different scales, which suits the multi-scale situation that arises when the UAV photographs different targets from different distances.
Finally, for the screened power line images, the lightweight power line model and the power line foreign matter model compute the predicted power line parameters and the predicted foreign matter bounding boxes to judge whether a foreign matter target is present.
Experiments show that a deep neural network based on VGG-16 that learns with multi-scale convolutional layers achieves 74.3% mAP (mean average precision) on the VOC2007 data set while maintaining a detection speed of 59 FPS. Transferring this approach to the task of detecting abnormal targets on power lines is therefore feasible and has important application value for efficient, accurate, real-time power line inspection.
Description of the drawings:
FIG. 1 is a flow chart of the method for detecting foreign matter on aerial image power lines based on a lightweight convolutional neural network.
FIG. 2 is an architecture diagram of the lightweight power line detection model.
FIG. 3 is an architecture diagram of the power line foreign matter detection model.
FIG. 4 is a diagram of the power line foreign matter detection process.
Detailed Description
Based on the above description, a specific implementation flow is as follows, but the scope of protection of this patent is not limited to this implementation flow.
Step 1: power line foreign object target detection model construction based on convolutional neural network
Step 1.1: power line detection model construction based on light convolutional neural network
Existing deep learning target detection models are designed for broad application scenarios and can often detect thousands of object classes; YOLO9000, for example, covers 9418 classes. In a power line inspection scene the target types are extremely limited, mainly power lines, balloons and kites, and this model only needs to recognize power lines, whose features are very limited. Existing deep learning detection models are therefore too redundant for the power line scene. A lightweight model is effective here: it can recognize the limited set of target types while improving detection speed.
The lightweight convolutional neural network is implemented on the mainstream open-source deep learning framework Caffe; its structure is shown in FIG. 2. The input is a power line aerial image, which passes through 6 convolutional layers with 3 × 3 kernels. The first four convolutional layers are each followed by batch normalization, which normalizes the input of the subsequent activation function so that each batch follows a standard normal distribution (mean 0, standard deviation 1) and the values are more stable. A rectified linear unit (ReLU) is used as the activation function after batch normalization, which speeds up model convergence. Max pooling is applied after the 4th convolutional layer to reduce the feature dimensionality and the computational complexity. The fifth convolutional layer uses a 3 × 3 kernel as the class prediction module; it has 6 output channels, each corresponding to the confidence of one anchor box. The sixth convolutional layer uses a 3 × 3 kernel to predict the bounding boxes. For each prediction box, the class is determined from the computed class prediction, and prediction boxes that belong to the background are filtered out. Prediction boxes whose confidence is below the threshold of 0.5 are then removed, and the 200 prediction boxes with the highest confidence are retained. Finally, the non-maximum suppression (NMS) algorithm filters out prediction boxes whose overlap exceeds the threshold of 0.7, giving the final prediction result.
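To make the batch normalization and ReLU step concrete, the following pure-Python sketch standardizes one batch of activations to zero mean and unit standard deviation and then applies ReLU. It is an illustration only: the function names are assumptions, the learnable scale and shift parameters of a full batch normalization layer are omitted, and the small constant `eps` is the usual guard against division by zero.

```python
import math
from typing import List


def batch_norm(values: List[float], eps: float = 1e-5) -> List[float]:
    """Standardize a batch to zero mean and (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def relu(values: List[float]) -> List[float]:
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]


activations = relu(batch_norm([2.0, 4.0, 6.0, 8.0]))
```

After normalization the batch is centered on zero, so ReLU zeroes the below-average half of the values and passes the rest through unchanged.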
Step 1.2: power line foreign matter detection model construction based on convolutional neural network
The detailed structure of the convolutional neural network proposed in this step is shown in FIG. 3. The input is an aerial image of power line foreign matter, which passes through 10 convolutional layers with 3 × 3 kernels. The first 6 convolutional layers form the backbone network, which extracts the features of the power line foreign matter targets, and a pooling operation is added after the 2nd, 4th and 6th convolutional layers. The 7th and 8th layers each use a 3 × 3 kernel followed by batch normalization, which normalizes the input of the subsequent activation function so that each batch follows a standard normal distribution (mean 0, standard deviation 1) and the values are more stable. A rectified linear unit (ReLU) is used as the activation function after batch normalization, which speeds up model convergence. A max pooling layer with stride 2 is added after each of the 7th and 8th layers, halving the height and width of the input features. The 7th, 8th, 9th and 10th convolutional layers serve as prediction modules; each module contains two 3 × 3 convolutional layers, one for class prediction and one for bounding box prediction, so that predictions at different scales are retained across the layers. The multi-scale predictions are then converted into two-dimensional arrays whose first dimension is the number of samples and whose second dimension is the number of channels, and all outputs are concatenated along the second dimension to combine the multi-scale predictions.
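The combination of multi-scale predictions described above can be sketched in pure Python. Each prediction is modeled here as a nested list indexed `[sample][channel][row][col]`; this layout and the function names are assumptions for illustration. Practical implementations often move the channel axis last before flattening so that the values for one position stay adjacent, a detail this sketch leaves out.

```python
def flatten_prediction(pred):
    """Turn one [sample][channel][row][col] prediction into [sample][values]."""
    return [[v for channel in sample for row in channel for v in row]
            for sample in pred]


def concat_predictions(preds):
    """Flatten every scale and join them along the second dimension."""
    flat = [flatten_prediction(p) for p in preds]
    batch = len(flat[0])
    return [[v for f in flat for v in f[i]] for i in range(batch)]


# Two scales for a batch of 3 samples: 2 channels on a 4 x 4 map and on a 2 x 2 map.
coarse = [[[[1] * 4 for _ in range(4)] for _ in range(2)] for _ in range(3)]
fine = [[[[2] * 2 for _ in range(2)] for _ in range(2)] for _ in range(3)]
merged = concat_predictions([coarse, fine])
```

The first dimension of `merged` is still the number of samples, and the second holds the 32 values of the first scale followed by the 8 values of the second.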
For each prediction box, the class is determined from the computed class prediction, and prediction boxes that belong to the background are filtered out. Prediction boxes whose confidence is below the threshold of 0.5 are then removed, and the 200 prediction boxes with the highest confidence are retained. Finally, the non-maximum suppression (NMS) algorithm filters out prediction boxes whose overlap exceeds the threshold of 0.7, giving the final prediction result.
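The filtering stages that precede NMS (drop background boxes, apply the 0.5 confidence threshold, keep the 200 most confident boxes) can be sketched as follows. The `Prediction` type, the function name and the convention that class index 0 means background are assumptions for illustration; the NMS stage itself is not included.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Prediction(NamedTuple):
    box: Tuple[float, float, float, float]
    class_scores: Sequence[float]  # index 0 is assumed to be the background class


def filter_predictions(preds: List[Prediction],
                       conf_threshold: float = 0.5,
                       top_k: int = 200):
    """Return (score, label, box) triples sorted by descending confidence."""
    kept = []
    for p in preds:
        # The class of a box is the one with the highest predicted score.
        label = max(range(len(p.class_scores)), key=lambda k: p.class_scores[k])
        score = p.class_scores[label]
        if label == 0 or score < conf_threshold:
            continue  # background box or low-confidence box
        kept.append((score, label, p.box))
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:top_k]
```

Thresholding before NMS keeps the pairwise overlap computation small, since NMS cost grows with the square of the number of candidate boxes.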
Step 2: neural network pre-training
The lightweight power line detection model is trained on the power line image data set, and the power line foreign matter model is trained on the power line foreign matter data set. Power line images are fed to the power line detection model, irrelevant frames are filtered out, and key frames that contain power lines are passed to the power line foreign matter detection model, achieving real-time detection of power line foreign matter targets.
Step 2.1: target detection model pre-training
Step 2.1.1: constructing a pre-training data set
In the pre-training stage, the Powerline Image Dataset is used to train the power line target detection model; it contains 2000 aerial images of power lines and 2000 aerial images of background. The power line aerial images were taken in different regions and different seasons, and the image size is 512 × 512. The power line foreign matter detection model is trained on a power line foreign matter data set that contains 1000 aerial images of balloons and kites covering different angles, regions and backgrounds.
Step 2.1.2: Model pre-training
In the power line foreign matter scene, a bounding box may appear at any position in the picture and have any size. To simplify the search, the power line foreign matter model uses default bounding boxes, i.e. anchor boxes, as the starting points of the search. An anchor box is configured by its size and its aspect ratio. For an input of size $w \times h$ and a given size $s \in (0, 1)$, a bounding box of size $ws \times hs$ is generated; for a given ratio $r > 0$, a bounding box of size $w\sqrt{r} \times h/\sqrt{r}$ is generated. In the invention, $s$ takes the values 0.1, 0.25 and 0.5, and $r$ takes the values 0.5, 1 and 2. For each input pixel, 5 default anchor boxes centered on it are sampled. During training, it is first determined which anchor box each true value (ground truth) in the training data matches, and the bounding box corresponding to the matched anchor box is responsible for predicting it. For each real object in the picture, the anchor box with the largest Intersection over Union (IoU) is matched to it. The IoU is a value between 0 and 1 that describes how closely two bounding boxes overlap, as shown in equation (1):

$$\mathrm{IoU}(\alpha, \xi) = \frac{|\alpha \cap \xi|}{|\alpha \cup \xi|} \tag{1}$$

where $\alpha$ is the predicted box and $\xi$ is the true bounding box. A large IoU indicates that the two boxes are very similar, and a small IoU indicates that they are not. For each remaining unmatched anchor box, if its IoU with some true value is greater than the threshold of 0.5, the anchor box is also matched to that true value.
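The anchor generation and matching rules above can be illustrated with a short Python sketch. Several points are assumptions rather than statements from the source: the shape rule $w\sqrt{r} \times h/\sqrt{r}$ for ratio boxes follows a common convention, the way 3 sizes and 3 ratios are combined into 5 shapes (all sizes at ratio 1, plus the non-unit ratios at the first size) is one plausible reading, boxes are given as corner coordinates, and all function names are hypothetical.

```python
import math


def anchor_shapes(w, h, sizes, ratios):
    """Return (width, height) pairs: every size at ratio 1, then other ratios at sizes[0]."""
    shapes = [(w * s, h * s) for s in sizes]
    shapes += [(w * sizes[0] * math.sqrt(r), h * sizes[0] / math.sqrt(r))
               for r in ratios if r != 1]
    return shapes


def iou(a, b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors, truths, threshold=0.5):
    """For each anchor, return the index of its matched true box or None."""
    matches = [None] * len(anchors)
    # Stage 1: every true box claims the anchor with the largest IoU.
    for j, t in enumerate(truths):
        best = max(range(len(anchors)), key=lambda i: iou(anchors[i], t))
        matches[best] = j
    # Stage 2: a still-unmatched anchor joins a true box if their IoU exceeds the threshold.
    for i, a in enumerate(anchors):
        if matches[i] is None and truths:
            j = max(range(len(truths)), key=lambda k: iou(a, truths[k]))
            if iou(a, truths[j]) > threshold:
                matches[i] = j
    return matches


shapes = anchor_shapes(512, 512, (0.1, 0.25, 0.5), (0.5, 1, 2))
```

With the sizes and ratios quoted in the text, `shapes` holds five (width, height) pairs for each center, and anchors that `match_anchors` leaves as `None` become negative samples.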
In both the power line detection model and the power line foreign matter detection model, the loss function $L(x, c, l, g)$ is defined as the weighted sum of the position error (loc) and the confidence error (conf), as shown in equation (2). Here $x$ is the input training image, $c$ is the predicted class confidence, $l$ is the predicted bounding box corresponding to an anchor box, $g$ is the position parameter of the true value, $N$ is the number of positive anchor box samples, and $\alpha$ is the weight that adjusts the ratio between the two error terms, set to 1 here.

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right) \tag{2}$$
$L_{loc}(x, l, g)$ is the loss function of the bounding box prediction, as shown in equation (3). Here $cx$ and $cy$ are the center coordinates of a bounding box, and $w$ and $h$ are its width and height. The anchor box position is denoted by $d = (d^{cx}, d^{cy}, d^{w}, d^{h})$ and the corresponding bounding box by $b = (b^{cx}, b^{cy}, b^{w}, b^{h})$. The target value $\hat{g}_j^m$, i.e. the conversion value of the bounding box with respect to the anchor box, is calculated according to equations (4), (5), (6) and (7). $l_i^m$ is the predicted value of parameter $m$ of the bounding box corresponding to the $i$-th anchor box. $Pos$ denotes the positive sample set, $i$ the anchor box number and $j$ the true value number. $x_{ij}^{k} = 1$ indicates that the $i$-th anchor box is matched to the $j$-th true value, whose category is $k$; $x_{ij}^{k} = 0$ indicates a mismatch. The Smooth L1 function is used for the position error.

$$L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^{k}\, \mathrm{smooth}_{L1}\left(l_i^m - \hat{g}_j^m\right) \tag{3}$$

$$\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}} \tag{4}$$

$$\hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}} \tag{5}$$

$$\hat{g}_j^{w} = \log\frac{g_j^{w}}{d_i^{w}} \tag{6}$$

$$\hat{g}_j^{h} = \log\frac{g_j^{h}}{d_i^{h}} \tag{7}$$
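As a worked illustration of the position error, the sketch below encodes a box relative to its anchor box and applies the Smooth L1 function to the difference between prediction and target. The center and log-scale encoding used here is the one commonly paired with this kind of loss in SSD-style detectors and is assumed rather than taken from the source; the function names are hypothetical.

```python
import math


def encode_box(box, anchor):
    """Convert a (cx, cy, w, h) box into offsets relative to a (cx, cy, w, h) anchor."""
    cx, cy, w, h = box
    dcx, dcy, dw, dh = anchor
    return ((cx - dcx) / dw, (cy - dcy) / dh, math.log(w / dw), math.log(h / dh))


def smooth_l1(x):
    """Quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def localization_loss(predicted, target):
    """Sum of Smooth L1 errors over the four box parameters of one matched anchor."""
    return sum(smooth_l1(p - t) for p, t in zip(predicted, target))


anchor = (0.5, 0.5, 0.2, 0.4)
target = encode_box((0.5, 0.5, 0.2, 0.4), anchor)   # a box identical to its anchor
loss = localization_loss((0.5, 0.0, 0.0, 2.0), target)
```

A box that coincides with its anchor encodes to all zeros, so the loss in this example comes only from the two non-zero predicted offsets: 0.125 from the quadratic branch and 1.5 from the linear branch.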
$L_{conf}(x, c)$ is the loss function of the class prediction, as shown in equation (8). Here $x$ denotes the input image, $Pos$ and $Neg$ denote the positive and negative sample sets, $o$ denotes the anchor box number (taken from the positive samples in the first sum and from the negative samples in the second), and $t$ denotes the true value number. $x_{ot}^{p}$ describes the matching state: $x_{ot}^{p} = 1$ indicates that the $o$-th anchor box matches the $t$-th true value, whose class is $p$; $x_{ot}^{p} = 0$ indicates a mismatch.

$$L_{conf}(x, c) = -\sum_{o \in Pos} x_{ot}^{p} \log \hat{c}_o^{p} - \sum_{o \in Neg} \log \hat{c}_o^{0}, \qquad \hat{c}_o^{p} = \frac{\exp(c_o^{p})}{\sum_{p} \exp(c_o^{p})} \tag{8}$$
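The class prediction loss can be illustrated as a softmax cross-entropy over positive and negative anchor boxes. This is a sketch under assumptions: class index 0 is taken to be the background class that negative anchors should predict, and the function names and input layout are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_loss(positives, negatives):
    """positives: (scores, true_class) pairs for matched anchors.
    negatives: score lists for unmatched anchors, which should predict class 0."""
    loss = 0.0
    for scores, label in positives:
        loss -= math.log(softmax(scores)[label])
    for scores in negatives:
        loss -= math.log(softmax(scores)[0])
    return loss


# One matched anchor with true class 1 and one unmatched anchor, both undecided.
example = confidence_loss([([0.0, 0.0], 1)], [[0.0, 0.0]])
```

With two equally scored classes each term contributes log 2, and the loss approaches zero as the score of the correct class dominates.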
Training then minimizes the loss function. The cost function is minimized with stochastic gradient descent (SGD): predictions are computed on the feature maps of the convolutional layers at different scales, and the prediction outputs of the different layers are combined. Pre-training requires all inputs to have the same size, so the invention resizes the original images to 512 × 512 pixels. The learning rate is the most important parameter of stochastic gradient descent and determines how fast the weights are updated. The momentum parameter and the weight decay factor improve the adaptivity of training. Based on experimental observation, the invention sets the learning rate to $10^{-3}$, the momentum parameter to 0.99 and the weight decay factor to the default 0.0005. The stochastic gradient descent process is accelerated on an NVIDIA TITAN XP device and runs for 60000 iterations.
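One SGD update with momentum and weight decay can be sketched as below, using the learning rate, momentum and weight decay values quoted above (taking the momentum as 0.99). The update rule shown is the common formulation in which the decay term is added to the gradient; the source does not spell out its exact rule, so this is an assumption, and the function name is hypothetical.

```python
def sgd_step(weights, grads, velocity, lr=1e-3, momentum=0.99, weight_decay=0.0005):
    """Return updated (weights, velocity) after one SGD step."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w      # L2 weight decay folded into the gradient
        v = momentum * v - lr * g     # momentum carries over part of the previous update
        new_velocity.append(v)
        new_weights.append(w + v)
    return new_weights, new_velocity


w, v = sgd_step([1.0], [2.0], [0.0])
```

Starting from a weight of 1.0 with gradient 2.0 and zero velocity, the step is about -0.0020005, so the weight moves to roughly 0.9979995; on later steps the velocity term keeps 99% of the previous update.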
The detailed pre-training process of the power line target detection model is as follows, where [Figure BDA0001662084780000085] are the initial power line boundary and class predictions, $c_1$ and $l_1$ are the final power line class and boundary predictions, [Figure BDA0001662084780000086] denotes the network parameters of the power line detection model, and $u \in (0, 15)$ is the index of the parameter iteration.
1) Read in the power line image data set and initialize the power line detection model.
2) Compute with the power line detection network and output the boundary prediction [Figure BDA0001662084780000087] and the class prediction [Figure BDA0001662084780000088].
3) Input [Figure BDA0001662084780000089] and [Figure BDA00016620847800000810] into the loss functions and sum their outputs, i.e. combine the outputs of the two loss functions, to obtain the loss output value [Figure BDA00016620847800000811].
4) According to [Figure BDA00016620847800000812], train the power line detection network with SGD and update the parameters to [Figure BDA00016620847800000813].
5) According to [Figure BDA00016620847800000814], train the power line detection network with SGD and update the parameters to [Figure BDA00016620847800000815].
6) Repeat steps 2 to 5 for 15 iterations to obtain the final pre-training parameters $\beta_1, c_1, l_1$ of the power line detection model.
The detailed pre-training process of the power line foreign object detection model is as follows, where [Figure BDA00016620847800000816] are the initial power line boundary and class prediction values, c2, l2 are the final power line boundary prediction value and class prediction value, and [Figure BDA0001662084780000091] represents the network parameters of the power line detection model, where u ∈ (0, 15) is the parameter iteration index.
1) Read in the power line foreign object image dataset and initialize the power line detection model.
2) Run the power line detection network and output the boundary prediction value [Figure BDA0001662084780000092] and the category prediction value [Figure BDA0001662084780000093].
3) Feed [Figure BDA0001662084780000094] and [Figure BDA0001662084780000095] into the loss functions and sum their outputs, that is, combine the outputs of the two loss functions to obtain the loss output value [Figure BDA0001662084780000096].
4) According to [Figure BDA0001662084780000097], train the power line detection network using SGD and update the parameters to [Figure BDA0001662084780000098].
5) According to [Figure BDA0001662084780000099], train the power line detection network using SGD and update the parameters to [Figure BDA00016620847800000910].
6) Repeat steps 2 to 5 fifteen times to obtain the final pre-training parameters β2, c2, l2 of the power line detection model.
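The iterative procedure above (forward pass, summed loss, SGD update, repeated 15 times) can be illustrated with a minimal sketch. It is not the patented network: a single scalar parameter and a toy quadratic loss stand in for the detection model and for the sum of the boundary loss and the class loss. The function names, the toy loss, and the particular momentum formulation (one common variant) are assumptions made for illustration; the learning rate, momentum and weight decay values are the ones stated in this document.

```python
def sgd_momentum_step(w, v, grad, lr=1e-3, momentum=0.99, weight_decay=0.0005):
    """One SGD update with momentum and weight decay (one common formulation)."""
    g = grad + weight_decay * w      # add the weight decay term to the gradient
    v = momentum * v + g             # accumulate velocity
    w = w - lr * v                   # move the parameter against the velocity
    return w, v


def toy_loss(w):
    """Toy stand-in for the summed loss: 'boundary' term plus 'class' term."""
    boundary_loss = (w - 3.0) ** 2
    class_loss = 0.5 * (w - 3.0) ** 2
    return boundary_loss + class_loss


def toy_grad(w):
    """Analytic gradient of toy_loss with respect to w."""
    return 3.0 * (w - 3.0)


def pretrain(w=0.0, iterations=15):
    """Repeat 'compute loss, update parameter' for a fixed number of iterations."""
    v = 0.0
    history = [toy_loss(w)]
    for _ in range(iterations):
        w, v = sgd_momentum_step(w, v, toy_grad(w))
        history.append(toy_loss(w))
    return w, history


final_w, loss_history = pretrain()
```

In a real detector the scalar `w` would be replaced by the full set of network weights and `toy_grad` by backpropagation through the summed loss.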
Step 3: Power line foreign object identification
Aerial video contains a large number of irrelevant frames, for example frames captured during take-off and landing of the unmanned aerial vehicle or while it flies around the periphery of the power line. These frames contain no power line target and reduce the efficiency of power line foreign object identification.
Step 3.1 Power line target detection
First, a video frame is input into the power line target detection model, which outputs bounding box prediction values and class prediction values. A non-maximum suppression algorithm is then applied to retain the bounding boxes with higher confidence, and finally the boxes are drawn.
Step 3.1.1 Power line object class and boundary prediction
An image x_i to be detected is fed into the power line target detection model, which outputs the predicted boundary value c1 and the class prediction value l1. Since each pixel generates several anchor boxes, a large number of similar boxes are predicted.
Step 3.1.2 Power line target class and boundary prediction result optimization
For the large number of similar boxes computed in step 3.1.1, non-maximum suppression is used to suppress redundant boxes: all boxes are sorted by confidence, the box with the highest confidence is selected, and all remaining boxes are traversed; any box whose IoU with the highest-scoring box is greater than the threshold 0.8 is deleted. This process is repeated on the remaining boxes, so that only boxes with higher confidence are kept. Finally, from the set of boxes remaining after non-maximum suppression, boxes with a confidence exceeding 0.6 are drawn as the final boxes.
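The suppression procedure described above can be sketched as follows. This is a minimal illustration, not the patented implementation: boxes are assumed to be axis-aligned tuples (x_min, y_min, x_max, y_max), and the function names `iou` and `nms` are hypothetical. The thresholds 0.8 (IoU) and 0.6 (confidence) are the values given in the text.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.8, score_threshold: float = 0.6) -> List[int]:
    """Return indices of boxes kept after suppression and confidence filtering."""
    # Sort candidate indices by confidence, highest first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while order:
        best = order.pop(0)          # current highest-scoring box
        kept.append(best)
        # Drop every remaining box that overlaps the best box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    # Only boxes above the confidence threshold are drawn.
    return [i for i in kept if scores[i] > score_threshold]
```

For example, two nearly identical boxes with scores 0.9 and 0.85 collapse to the first one, while a distant box with score 0.3 survives suppression but is dropped by the confidence filter.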
Step 3.2 Power line foreign object target detection
The key frames containing power lines obtained in step 3.1 are input into the power line foreign object target detection model, which computes the foreign object boundary prediction values and class prediction values. For key frames that contain foreign objects, the foreign object bounding boxes are obtained and checked for overlap with the power line bounding boxes, and finally the overlapping bounding boxes are drawn.
Step 3.2.1 Power line foreign object class and boundary prediction
An image x_i to be detected is fed into the power line foreign object detection model, which outputs the predicted boundary value c2 and the class prediction value l2. Each pixel generates several anchor boxes, so a large number of similar boxes are predicted.
Step 3.2.2 Optimization of power line target class and boundary prediction results
For the large number of similar boxes computed in step 3.2.1, non-maximum suppression is used to suppress redundant boxes, and bounding boxes with a confidence above 0.6 are kept. The predicted foreign object boxes are then compared with the predicted power line boxes, foreign object boxes whose IoU with the power line boxes is 0 are deleted, and finally the remaining boxes are drawn.
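The overlap check between foreign object boxes and power line boxes can be sketched as below. This is an illustrative assumption about how the comparison might be coded, not the patented implementation: boxes are assumed to be (x_min, y_min, x_max, y_max) tuples, and the helper names are hypothetical. A foreign object box is kept only if its IoU with at least one power line box is greater than 0.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def keep_objects_on_lines(foreign_boxes: List[Box], line_boxes: List[Box]) -> List[Box]:
    """Keep foreign object boxes that overlap at least one power line box."""
    return [
        f for f in foreign_boxes
        if any(box_iou(f, line) > 0 for line in line_boxes)
    ]


# A kite box crossing a thin vertical power line box is kept;
# a balloon box far away from any power line is discarded.
kept = keep_objects_on_lines(
    [(0, 0, 10, 10), (100, 100, 110, 110)],
    [(5, 0, 6, 200)],
)
```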
Step 3.3: Evaluation of detection results
The invention evaluates the boundary prediction results using a criterion based on the mean absolute error (MAE), computed as follows:
$e_i = |f_i - y_i|$ (9)
$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} e_i$ (10)
where f_i is the predicted value, y_i is the true value, e_i is the absolute error, and n is the number of predictions evaluated.
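As a small worked example of the mean absolute error, the sketch below averages the absolute differences between predicted and true values. The function name and the sample numbers are illustrative assumptions, not data from the invention.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of the absolute errors |f_i - y_i| over all predictions."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(f - y) for f, y in zip(predicted, actual)]
    return sum(errors) / len(errors)


# Hypothetical predicted and true box coordinates; errors are 2, 2 and 3.
mae = mean_absolute_error([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

Here the three absolute errors sum to 7, so the MAE is 7/3.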

Claims (3)

1. A low-altitude aerial image power line foreign matter detection method based on a lightweight convolutional neural network, characterized in that:
first, a lightweight power line detection model is constructed using a convolutional neural network, the network is pre-trained on a self-built power line image dataset, and deep features of the power line are extracted; second, a power line foreign matter detection model is trained using a convolutional neural network, convolutional layers with different lengths and widths are added to the model, and predicted values are computed in multiple layers simultaneously; the outputs of the different layers are then combined so as to learn deep features of multi-scale targets; pre-training is performed on a self-built power line foreign matter dataset, and data augmentation by random flipping, cropping and color change is used to expand the amount of data and further improve generalization; finally, in the power line foreign matter detection stage, the power line detection model is first used to remove irrelevant frames from the video and retain key frames containing power lines, and the predicted power line bounding boxes in the key frames are extracted; the power line foreign matter detection model is then used to detect the key frames and obtain predicted values for all targets, a non-maximum suppression algorithm filters out highly similar bounding boxes and retains those with higher confidence, and the resulting power line bounding boxes and foreign matter target bounding boxes are used to detect foreign matter in the aerial power line images;
Step 1: Power line foreign object target detection model construction based on convolutional neural network
Step 1.1: Power line detection model construction based on lightweight convolutional neural network
A power line aerial image is input and convolved through 6 convolutional layers with a kernel size of 3 × 3; batch normalization is applied in the first four convolutional layers to normalize the input of the subsequent activation function so that each batch follows a standard normal distribution, and a rectified linear unit (ReLU) is used as the activation function after batch normalization, so that the model converges faster; a max pooling operation is performed after the 4th convolutional layer to reduce the feature dimension and the amount of computation; in the fifth convolutional layer, a 3 × 3 convolution kernel is used as the class prediction module, with 6 output channels, each channel corresponding to the confidence of one anchor frame; in the sixth convolutional layer, a 3 × 3 convolution kernel is used to predict the bounding boxes; for each prediction box, its class is determined from the computed class prediction value, and prediction boxes belonging to the background are filtered out; prediction boxes with a confidence below the threshold of 0.5 are then filtered out, and the top 200 prediction boxes by confidence are retained; finally, the non-maximum suppression (NMS) algorithm is applied to filter out prediction boxes whose overlap exceeds the threshold of 0.7, yielding the prediction result;
Step 1.2: Power line foreign matter detection model construction based on convolutional neural network
An aerial image of power line foreign matter is input and convolved through 10 convolutional layers with a kernel size of 3 × 3; the first 6 convolutional layers serve as the backbone network for extracting features of the power line foreign matter target, and a pooling operation is added after the 2nd, 4th and 6th convolutional layers; layers 7 and 8 each use a 3 × 3 convolution kernel with batch normalization, which normalizes the input of the subsequent activation function so that each batch follows a standard normal distribution and the values are more stable, and a rectified linear unit (ReLU) is used as the activation function after batch normalization, so that the model converges faster; a max pooling layer with a stride of 2 is added after layer 7 and after layer 8, halving the length and width of the input features; convolutional layers 7, 8, 9 and 10 serve as prediction modules, each module containing two 3 × 3 convolutional layers for class prediction and bounding box prediction respectively, so that predicted values at different scales across multiple layers are retained; the outputs are then converted into two-dimensional arrays, in which the first dimension is the number of samples and the second dimension is the number of channels, and all outputs are concatenated along the second dimension to combine the multi-scale predicted values; for each prediction box, its class is determined from the computed class prediction value, and prediction boxes belonging to the background are filtered out; prediction boxes with a confidence below the threshold of 0.5 are then filtered out, and the top 200 prediction boxes by confidence are retained; finally, the non-maximum suppression (NMS) algorithm is applied to filter out prediction boxes whose overlap exceeds the threshold of 0.7, yielding the prediction result;
Step 2: Neural network pre-training
The lightweight power line detection model is trained on the power line image dataset, and the power line foreign matter model is trained on the power line foreign matter dataset; at detection time, power line images are first fed into the power line detection model to filter out irrelevant frames, and the key frames containing power lines are then fed into the power line foreign matter detection model, thereby achieving real-time power line foreign matter target detection;
Step 2.1: Target detection model pre-training
Step 2.1.1: Constructing a pre-training dataset
In the pre-training stage, a power line image dataset is selected to train the power line target detection model; the dataset comprises a plurality of power line aerial images and a plurality of background aerial images; the power line aerial images were taken in different regions and different seasons, and the image size is 512 × 512; a power line foreign matter dataset is selected to train the power line foreign matter detection model; this dataset comprises aerial images of two kinds of foreign matter, balloons and kites, covering different angles, regions and backgrounds;
Step 2.1.2: Model pre-training
In a power line foreign matter scene, a box can appear at any position in the picture and have any size; the power line foreign matter model uses default bounding boxes, namely anchor frames, as the starting point of the search; an anchor frame is defined by its size and its aspect ratio; for an input of size w × h and a given size s ∈ (0, 1), a bounding box of size ws × hs is generated; for a given ratio r > 0, a bounding box of size [Figure FDA0003654508510000021] is generated; s takes the values 0.1, 0.25 and 0.5, and r takes the values 0.5, 1 and 2; for each input pixel, 5 default anchor frames centered on that pixel are sampled; in the training process, it is first determined which anchor frame each true value in the training data is matched with, and the bounding box corresponding to the matched anchor frame is then predicted; each real object in the picture is matched with the anchor frame that has the largest intersection over union (IoU) with it; the IoU is a value describing the closeness of two bounding boxes, as shown in formula (1):
[Figure FDA0003654508510000031]
where α is the prediction result and ξ is the true boundary value; a large IoU indicates that the two boxes are very similar, and a small IoU indicates that they are dissimilar; for the remaining unmatched anchor frames, if the IoU with a true value is greater than the threshold of 0.5, the anchor frame is also matched with that true value;
in the power line detection model and the power line foreign matter detection model, L(x, c, l, g) denotes the loss function, defined as the weighted sum of the position error and the confidence error, as shown in formula (2), where x is the input training image, c is the class confidence prediction value, l is the predicted value of the bounding box corresponding to the anchor frame, g is the position parameter of the true value, N is the number of positive anchor frame samples, and α is the adjustment ratio between the foreground loss function and the background loss function, which is set to 1;
Figure FDA0003654508510000032
L_loc(x, l, g) is the loss function of bounding box prediction, as shown in formula (3), where cx, cy are the center coordinates of the bounding box, w, h are its width and height, the anchor frame position is denoted by d = (d_cx, d_cy, d_w, d_h), and the corresponding bounding box is denoted by b = (b_cx, b_cy, b_w, b_h); [Figure FDA0003654508510000033] is the conversion value of the bounding box relative to the anchor frame, calculated according to formulas (4), (5), (6) and (7); [Figure FDA0003654508510000034] is the predicted value of parameter m of the bounding box corresponding to the i-th anchor frame; Pos denotes the positive sample set, i the anchor frame index, and j the true value index; [Figure FDA0003654508510000035] indicates that the i-th anchor frame matches the j-th true value and that the class of the true value is k, while [Figure FDA0003654508510000036] indicates no match; the Smooth L1 function is used for the position error;
Figure FDA0003654508510000037
Figure FDA0003654508510000038
Figure FDA0003654508510000041
Figure FDA0003654508510000042
Figure FDA0003654508510000043
L_conf(x, c) denotes the loss function of class prediction, where x is the input image, Neg is the negative sample set, o is the anchor frame index within the samples, and t is the true value index; [Figure FDA0003654508510000044] describes the matching state: [Figure FDA0003654508510000045] indicates that the o-th anchor frame matches the t-th true value and that the class of the true value is p, while [Figure FDA0003654508510000046] indicates no match; L_conf is given by formula (8):
Figure FDA0003654508510000047
training is then carried out by minimizing the loss function; the cost function is minimized by stochastic gradient descent, predictions are computed on the feature maps of the multiple convolutional layers of different scales, and the prediction outputs of the different layers are combined; pre-training requires all data to have a uniform size, so the original images are resized to 512 × 512 pixels; the learning rate is set to 10^-3, the momentum parameter to 0.99 and the weight decay factor to the default 0.0005, and the stochastic gradient descent learning process is accelerated with an NVIDIA TITAN XP device for more than 60000 iterations.
2. The detection method according to claim 1, characterized in that:
the detailed pre-training process of the power line target detection model is as follows, where [Figure FDA0003654508510000048] are the initial power line boundary and class prediction values, c1, l1 are the final power line boundary prediction value and class prediction value, and [Figure FDA0003654508510000049] represents the network parameters of the power line detection model, where u ∈ (0, 15) is the parameter iteration index;
1) Read in the power line image dataset and initialize the power line detection model.
2) Run the power line detection network and output the boundary prediction value [Figure FDA00036545085100000410] and the category prediction value [Figure FDA00036545085100000411].
3) Feed [Figure FDA00036545085100000412] and [Figure FDA00036545085100000413] into the loss functions and sum their outputs, that is, combine the outputs of the two loss functions to obtain the loss output value [Figure FDA00036545085100000414].
4) According to [Figure FDA00036545085100000415], train the power line detection network using SGD and update the parameters to [Figure FDA00036545085100000416].
5) According to [Figure FDA0003654508510000051], train the power line detection network using SGD and update the parameters to [Figure FDA0003654508510000052].
6) Repeat steps 2 to 5 fifteen times to obtain the final pre-training parameters β1, c1, l1 of the power line detection model.
The detailed pre-training process of the power line foreign object detection model is as follows, where [Figure FDA0003654508510000053] are the initial power line boundary and class prediction values, c2, l2 are the final power line boundary prediction value and class prediction value, and [Figure FDA0003654508510000054] represents the network parameters of the power line detection model, where u ∈ (0, 15) is the parameter iteration index;
1) Read in the power line foreign object image dataset and initialize the power line detection model.
2) Run the power line detection network and output the boundary prediction value [Figure FDA0003654508510000055] and the category prediction value [Figure FDA0003654508510000056].
3) Feed [Figure FDA0003654508510000057] and [Figure FDA0003654508510000058] into the loss functions and sum their outputs, that is, combine the outputs of the two loss functions to obtain the loss output value [Figure FDA0003654508510000059].
4) According to [Figure FDA00036545085100000510], train the power line detection network using SGD and update the parameters to [Figure FDA00036545085100000511].
5) According to [Figure FDA00036545085100000512], train the power line detection network using SGD and update the parameters to [Figure FDA00036545085100000513].
6) Repeat steps 2 to 5 fifteen times to obtain the final pre-training parameters β2, c2, l2 of the power line detection model.
3. The detection method according to claim 1, characterized in that:
Step 3: Power line foreign object identification
First, detection is performed with the power line detection model; frames without power lines are not processed further, and frames with power lines proceed to power line foreign matter detection, thereby improving the overall detection speed;
Step 3.1 Power line target detection
First, a video frame is input into the power line target detection model, which outputs bounding box prediction values and class prediction values; a non-maximum suppression algorithm is then applied to retain the bounding boxes with higher confidence, and finally the boxes are drawn;
Step 3.1.1 Power line object class and boundary prediction
An image x_i to be detected is fed into the power line target detection model, which outputs the predicted boundary value c1 and the class prediction value l1; since each pixel generates several anchor frames, a large number of similar boxes are predicted;
Step 3.1.2 Power line target class and boundary prediction result optimization
For the large number of similar boxes computed in step 3.1.1, non-maximum suppression is used to suppress redundant boxes: all boxes are sorted by confidence, the box with the highest confidence is selected, and all remaining boxes are traversed; any box whose IoU with the current highest-scoring box is greater than the threshold 0.8 is deleted; this process is repeated, so that only boxes with higher confidence are kept; finally, from the set of boxes remaining after non-maximum suppression, boxes with a confidence exceeding 0.6 are drawn as the final boxes;
Step 3.2 Power line foreign object target detection
The key frames containing power lines obtained in step 3.1 are input into the power line foreign object target detection model, which computes the foreign object boundary prediction values and class prediction values; for key frames that contain foreign objects, the foreign object bounding boxes are obtained and checked for overlap with the power line bounding boxes, and finally the overlapping bounding boxes are drawn;
Step 3.2.1 Power line foreign object class and boundary prediction
An image x_i to be detected is fed into the power line foreign object detection model, which outputs the predicted boundary value c2 and the class prediction value l2; each pixel generates several anchor frames, so a large number of similar boxes are predicted;
Step 3.2.2 Optimization of power line target class and boundary prediction results
For the large number of similar boxes computed in step 3.2.1, non-maximum suppression is used to suppress redundant boxes, and boxes with a confidence exceeding 0.6 are kept; the predicted foreign matter boxes are then compared with the predicted power line boxes, foreign matter boxes whose IoU with the power line boxes is 0 are deleted, and finally the remaining boxes are drawn;
Step 3.3: Evaluation of detection results
The boundary prediction results are evaluated using a criterion based on the mean absolute error (MAE), computed as follows:
$e_i = |f_i - y_i|$ (9)
$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} e_i$ (10)
where f_i is the predicted value, y_i is the true value, e_i is the absolute error, and n is the number of predictions evaluated.
CN201810465955.2A 2018-05-16 2018-05-16 Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network Active CN108647655B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810465955.2A CN108647655B (en) 2018-05-16 2018-05-16 Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810465955.2A CN108647655B (en) 2018-05-16 2018-05-16 Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network

Publications (2)

Publication Number Publication Date
CN108647655A CN108647655A (en) 2018-10-12
CN108647655B true CN108647655B (en) 2022-07-12

Family

ID=63755957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810465955.2A Active CN108647655B (en) 2018-05-16 2018-05-16 Low-altitude aerial image power line foreign matter detection method based on light convolutional neural network

Country Status (1)

Country Link
CN (1) CN108647655B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN106971152A (en) * 2017-03-16 2017-07-21 天津大学 A kind of method of Bird's Nest in detection transmission line of electricity based on Aerial Images
CN107423760A (en) * 2017-07-21 2017-12-01 西安电子科技大学 Based on pre-segmentation and the deep learning object detection method returned
CN107392901A (en) * 2017-07-24 2017-11-24 国网山东省电力公司信息通信公司 A kind of method for transmission line part intelligence automatic identification
CN107563412A (en) * 2017-08-09 2018-01-09 浙江大学 A kind of infrared image power equipment real-time detection method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SSD: Single Shot MultiBox Detector; Wei Liu et al.; Springer; 2016-09-17; main text, page 25 paragraph 2 to page 27 paragraph 2, Table 5, page 33 paragraph 2 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant