CN110929646A - Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image - Google Patents

Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image

Info

Publication number
CN110929646A
Authority
CN
China
Prior art keywords
power distribution
frames
frame
tower
distribution tower
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911154496.7A
Other languages
Chinese (zh)
Inventor
王仁书
陈彬
张松
谢朝辉
陈杰
林德源
刘冰倩
韩纪层
赵静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd
State Grid Fujian Electric Power Co Ltd
Zhangzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Management and Training Center of State Grid Fujian Electric Power Co Ltd
Original Assignee
Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd
State Grid Fujian Electric Power Co Ltd
Zhangzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd
Management and Training Center of State Grid Fujian Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd, State Grid Fujian Electric Power Co Ltd, Zhangzhou Power Supply Co of State Grid Fujian Electric Power Co Ltd, Management and Training Center of State Grid Fujian Electric Power Co Ltd filed Critical Electric Power Research Institute of State Grid Fujian Electric Power Co Ltd
Priority to CN201911154496.7A priority Critical patent/CN110929646A/en
Publication of CN110929646A publication Critical patent/CN110929646A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply

Abstract

The invention relates to a rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images, which comprises the following steps: step S1: manually labeling the power distribution tower information in the unmanned aerial vehicle aerial images, dividing the power distribution towers into a normal type and an inverted type, generating XML format files and preprocessing them; step S2: performing clustering analysis on the labeling frames to determine 4 anchor frames; step S3: establishing a power distribution tower reverse-off information rapid identification model, designing a loss function, and training by error back propagation to obtain the optimal weight; step S4: applying the optimal weight to the power distribution tower reverse-off information rapid identification model to obtain the position information of the normal and inverted power distribution towers, completing the rapid identification of the reverse-off information. The method is fast and lightweight, can process massive unmanned aerial vehicle aerial image data in real time, is suitable for mobile terminals or device terminals, and promotes the intelligent processing of unmanned aerial vehicle aerial images of power distribution towers.

Description

Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image
Technical Field
The invention relates to the field of image recognition, in particular to a rapid recognition method for power distribution tower reverse-off information based on aerial images of unmanned aerial vehicles.
Background
Power grids in southeastern coastal areas are often affected by meteorological disasters such as typhoons, which cause distribution towers to collapse or break and seriously threaten the safety of the distribution network. Collapsed towers that are not dealt with in time can easily cause secondary accidents and endanger personal safety, so the maintenance and inspection of distribution towers has become an important part of distribution network operation.
In recent years, unmanned aerial vehicle technology has developed rapidly and now supports functions such as automatic flight, active obstacle avoidance, path planning and high-definition image capture. Unmanned aerial vehicles are therefore widely used for power inspection in China. Against this background, unmanned aerial vehicles have also been applied to aerial photography of distribution towers as an auxiliary means of line operation and maintenance: tower collapse information is mainly collected manually from the unmanned aerial vehicle aerial images, after which emergency repairs are carried out promptly. However, each flight of the unmanned aerial vehicle generates a large number of pictures, manual processing cannot meet the timeliness requirement, and such repetitive work is prone to misjudgments and missed detections, so intelligent processing of unmanned aerial vehicle aerial images is urgently needed.
Disclosure of Invention
In view of the above, the invention aims to provide a power distribution tower reverse-off information rapid identification method based on an unmanned aerial vehicle aerial image, which has the characteristics of rapidness and light weight, is used for processing massive unmanned aerial vehicle aerial image data in real time, and is suitable for a mobile terminal or an equipment terminal, so as to promote the intellectualization of future unmanned aerial vehicle aerial power distribution tower image processing.
The invention is realized by adopting the following scheme: a rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images comprises the following steps:
step S1: manually labeling the power distribution tower information in the unmanned aerial vehicle aerial images using LabelImg software, dividing the power distribution towers into a normal type and an inverted type, and generating XML format files; each XML format file comprises the position information and category information of the towers; the XML format files are preprocessed using a Python script, that is, the Python script normalizes the tower position information by the length and width of the aerial image; the returned labels are the center-point coordinates, width and height (x, y, w, h) of each tower labeling frame and are stored in a txt file, one txt file per image; at the same time, the data are divided in proportion into a training set and a cross training set;
step S2: for the power distribution tower images among the unmanned aerial vehicle aerial images of step S1, performing clustering analysis on the labeling frames using the K-Means algorithm to determine 4 anchor frames;
step S3: establishing a power distribution tower reverse-off information rapid identification model, designing a loss function, and training by error back propagation until the error converges to within the tolerable range of $10^{-1}$, then storing the optimal weight;
step S4: applying the optimal weight to the power distribution tower reverse-off information rapid identification model for unmanned aerial vehicle aerial images; the model outputs a plurality of prediction frames, and repeated target frames are removed using non-maximum suppression (NMS) to obtain the final position information of the normal and inverted power distribution towers, completing the rapid identification of the reverse-off information.
Further, the step S2 specifically includes the following steps:
step S21: reading the coordinates of the central point, the width and the height data (x, y, w and h) of the marking frame of the power distribution tower in each sample from each txt file obtained in the step S1 by using a script, setting the clustering number k to be 4, and randomly initializing the clustering centers of four classes; the single txt file comprises (x, y, w, h) information of a plurality of power distribution tower marking frames;
step S22: calculating the distance from each sample point to each cluster center; the sample point is the power distribution tower marking frame information (x, y, w, h) contained in a single txt file, and the clustering center is the coordinate, width and height of the central point of a marking frame initialized randomly and is consistent with the data type of the sample point;
the distance function between the labeling frame (x, y, w, h) and the clustering center frame is set as:
$$d(box, centroid) = 1 - \mathrm{IOU}(box, centroid)$$
where box denotes a labeling frame given by its center coordinates, width and height (x, y, w, h); centroid denotes the clustering center frame given by its center coordinates, width and height; and IOU is the intersection ratio (intersection over union) between the two frames, that is, the overlap rate of the labeling frame box1 and the clustering center frame box2, expressed by the formula:
$$\mathrm{IOU}(box_1, box_2) = \frac{\mathrm{area}(box_1 \cap box_2)}{\mathrm{area}(box_1 \cup box_2)}$$
step S23: dividing each sample into corresponding classes according to the distance;
step S24: calculating the sum of the sample points of each class, averaging the sum of the sample points, and updating the center of the class;
step S25: judging whether the difference between the current clustering centers and the previous clustering centers is smaller than a set limit; if not, returning to step S22; if so, the clustering yields the sizes of the 4 anchor frames, two for each of the 13 × 13 and 26 × 26 scales.
Further, in step S3, establishing a fast identification model of the power distribution tower outage information and designing a loss function, specifically, the contents are as follows:
defining the feature extraction network of the power distribution tower reverse-off information rapid identification model, wherein the feature extraction network comprises 7 convolution blocks, and each convolution block comprises a convolution operation and a maximum pooling operation;
defining a resolution feature network, whose output format has two feature scales with resolutions of 13 × 13 and 26 × 26; the resolution feature network comprises 3 × 3 and 1 × 1 convolution operations; the final output of the power distribution tower reverse-off information rapid identification model consists of two three-dimensional matrices of size 13 × 13 × 14 and 26 × 26 × 14, whose last dimension contains, for each of two power distribution tower target frames, the transformed center position $b_x$, $b_y$, the width and height $b_w$, $b_h$, the category, and the confidence of the target frame, 7 × 2 values in total; the model therefore outputs 13 × 13 × 2 + 26 × 26 × 2 target frames, 1690 in total; these data are finally converted into the two positioning frames of the object to be recognized, the target category in each frame, and the confidence of each frame; the output corresponding to each grid is set to two prediction frames; the conversion formulas are:
$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}$$
$$b_h = p_h e^{t_h}$$
where $c_x$, $c_y$ are the numbers of grids by which the grid containing the center coordinate of the frame is offset from the first grid at the upper-left corner; $t_x$, $t_y$ are the predicted coordinates of the center point of the frame; $\sigma(\cdot)$ is the logistic function, which normalizes the coordinates to between 0 and 1; the resulting $b_x$, $b_y$ are values normalized relative to the grid; $t_w$, $t_h$ are the predicted width and height of the frame; and $p_w$, $p_h$ are the width and height of the anchor frame. The conversion is the inverse process of the above formulas: from $(t_x, t_y, t_w, t_h)$, the center coordinates and the width and height of the final target frame are obtained;
constructing the real label: the real label has the form n × n × 14, where n is 13 or 26, corresponding to the network output form; the input image is divided into 13 × 13 and 26 × 26 grids; for a given sample in the training set, if the center of the normalized coordinate position of a power distribution tower obtained in step S1 falls into a certain grid, the class probability of that tower in the output vector corresponding to that grid is set to 1, and the prediction probabilities of the other grids for that tower are set to 0;
establishing the loss function: by back-propagating the loss function, the network structure automatically learns the features of the normal towers and the inverted towers; the loss function is as follows:
[Equation image BDA0002284342940000051: loss function, not reproduced in this text]
where $\lambda_{coord}$, the weight of the position error, is set to 20 to strengthen the penalty on position loss; $\lambda_{noobj}$, the weight of the frame confidence error when no object exists in the output frame, is set to 1 to weaken the confidence loss penalty for frames without objects;
$\mathbb{1}_{ij}^{obj}$ indicates that the position error is added only when an object exists in the output frame; K represents the number of divided grids; M represents the number of output boxes.
Further, the step of removing the repeated target frame by using the non-maximum suppression method in step S4 is:
step SA: collecting all the obtained frames, 13 × 13 × 2 + 26 × 26 × 2 = 1690 in total, and sorting them according to the confidence score;
step SB: setting a confidence threshold of 0.6, and setting the score of every frame whose confidence is lower than the threshold to 0;
step SC: traversing all the frames, finding the prediction frame with the maximum score, and adding it to an output list;
step SD: traversing the remaining frames, and, according to a preset IOU (intersection ratio) threshold, setting to 0 the scores of all candidate frames whose intersection ratio with the frame in the output list is higher than the threshold;
step SE: judging whether any frame with a non-zero confidence score remains; if so, returning to step SC; otherwise, outputting the prediction frames in the list.
Further, in the step S3, the image data input during the training is performed in an online random enhancement mode, including slight scaling of the image between 0.9 and 1.1, random cropping of 10%, and color enhancement mode, so as to enhance the generalization ability of the network to adapt to the lightweight condition of the model.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention carries out K-Means clustering analysis on the marking frame, thereby obtaining a better anchoring frame.
(2) Compared with the traditional feature extraction network, the feature extraction network and the resolution network used in the invention are pruned and improved, and the required calculation parameters are less, so that the network has the advantages of light weight and low requirement on equipment performance.
(3) The loss function mode designed by the invention can effectively reduce errors in positioning and classification, and the sources of the errors are considered more comprehensively, so that the model can achieve higher accuracy.
(4) The invention designs a method for enhancing data on line, performs random multi-scale training, enhances the robustness of the network and reduces the side effect caused by light weight of the network.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a diagram of a model network structure for rapidly identifying power distribution tower reverse-off information based on an unmanned aerial vehicle aerial image according to an embodiment of the invention.
FIG. 3 is a flowchart of a K-Means clustering method according to an embodiment of the present invention.
Fig. 4 is a flowchart of a non-maximum suppression method according to an embodiment of the invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; it should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1 and 2, the embodiment provides a method for rapidly identifying disconnection information of a power distribution tower based on an aerial image of an unmanned aerial vehicle, the method includes two modules, namely a network training module and a network detection module, and when an identification task is executed, the network detection module is switched to; when no task is executed, the network training module is switched to learn the tower information characteristics of the newly added data set so as to adapt to the situation under the more complex background around the reversed pole.
The method comprises the following steps:
step S1: manually labeling the power distribution tower information in the unmanned aerial vehicle aerial images using LabelImg software, dividing the power distribution towers into a normal type and an inverted type, and generating XML format files; each XML format file comprises the position information and category information of the towers; the XML format files are preprocessed using a Python script, that is, the Python script normalizes the tower position information by the length and width of the aerial image; the returned labels are the center-point coordinates, width and height (x, y, w, h) of each tower labeling frame and are stored in a txt file, one txt file per image; at the same time, the data are divided in proportion into a training set and a cross training set; LabelImg is a visual image annotation tool that generates an XML file of the labeling information for subsequent processing by the Python script.
Step S2: for the power distribution tower images among the unmanned aerial vehicle aerial images of step S1, performing clustering analysis on the labeling frames using the K-Means algorithm to determine 4 anchor frames;
step S3: establishing a power distribution tower reverse-off information rapid identification model, designing a loss function, and training by error back propagation until the error converges to within the tolerable range of $10^{-1}$, then storing the optimal weight;
step S4: applying the optimal weight to the power distribution tower reverse-off information rapid identification model for unmanned aerial vehicle aerial images; the model outputs a plurality of prediction frames, and repeated target frames are removed using non-maximum suppression (NMS) to obtain the final position information of the normal and inverted power distribution towers, completing the rapid identification of the reverse-off information.
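The preprocessing in step S1 can be illustrated with a minimal Python sketch. It assumes the Pascal VOC XML layout that LabelImg can export (`size/width`, `size/height`, `object/name`, `object/bndbox`); the class names `normal` and `inverted`, the helper names, and the one-line-per-frame txt layout are hypothetical choices for illustration and are not taken from the patent.

```python
import xml.etree.ElementTree as ET

# Assumed label names; the patent only states that there are two categories.
CLASS_IDS = {"normal": 0, "inverted": 1}


def voc_xml_to_rows(xml_text):
    """Parse one annotation and return (class_id, x, y, w, h) per tower,
    with center coordinates and sizes normalized by the image width/height."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    rows = []
    for obj in root.iter("object"):
        class_id = CLASS_IDS[obj.find("name").text]
        bb = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bb.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        x = (xmin + xmax) / 2 / img_w   # normalized center x
        y = (ymin + ymax) / 2 / img_h   # normalized center y
        w = (xmax - xmin) / img_w       # normalized width
        h = (ymax - ymin) / img_h       # normalized height
        rows.append((class_id, x, y, w, h))
    return rows


def rows_to_txt(rows):
    """Serialize the rows as the content of one txt file (one frame per line)."""
    return "\n".join(f"{c} {x:.6f} {y:.6f} {w:.6f} {h:.6f}" for c, x, y, w, h in rows)


# Hand-written sample annotation, used only to exercise the functions above.
SAMPLE_XML = """<annotation>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object><name>inverted</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>250</ymax></bndbox>
  </object>
</annotation>"""

print(rows_to_txt(voc_xml_to_rows(SAMPLE_XML)))
```

Writing one such string per image to its own txt file would give the one-image-per-file correspondence described in step S1.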
As shown in fig. 3, in this embodiment, the step S2 specifically includes the following steps:
step S21: reading the coordinates of the central point, the width and the height data (x, y, w and h) of the marking frame of the power distribution tower in each sample from each txt file obtained in the step S1 by using a script, setting the clustering number k to be 4, and randomly initializing the clustering centers of four classes; the single txt file comprises (x, y, w, h) information of a plurality of power distribution tower marking frames;
step S22: calculating the distance from each sample point to each cluster center; the sample point is the power distribution tower marking frame information (x, y, w, h) contained in a single txt file, and the clustering center is the coordinate, width and height of the central point of a marking frame initialized randomly and is consistent with the data type of the sample point;
the distance function between the labeling frame (x, y, w, h) and the clustering center frame is set as:
$$d(box, centroid) = 1 - \mathrm{IOU}(box, centroid)$$
where box denotes a labeling frame given by its center coordinates, width and height (x, y, w, h); centroid denotes the clustering center frame given by its center coordinates, width and height; and IOU is the intersection ratio (intersection over union) between the two frames, that is, the overlap rate of the labeling frame box1 and the clustering center frame box2, expressed by the formula:
$$\mathrm{IOU}(box_1, box_2) = \frac{\mathrm{area}(box_1 \cap box_2)}{\mathrm{area}(box_1 \cup box_2)}$$
step S23: dividing each sample into corresponding classes according to the distance;
step S24: calculating the sum of the sample points of each class, averaging the sum of the sample points, and updating the center of the class;
step S25: judging whether the difference between the current clustering centers and the previous clustering centers is smaller than a set limit; if not, returning to step S22; if so, the clustering yields the sizes of the 4 anchor frames, two for each of the 13 × 13 and 26 × 26 scales.
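Steps S21 to S25 can be sketched in Python as follows. This is a simplified illustration, not the patent's script: it assumes that frames are compared by width and height only, with their centers aligned (a common simplification for anchor clustering, whereas the text above describes samples as (x, y, w, h)), and the function names, tolerance and iteration cap are hypothetical.

```python
import random


def iou_wh(a, b):
    """IOU of two frames given as (w, h), assuming they share the same center."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=4, tol=1e-6, max_iter=100, seed=0):
    """Cluster (w, h) pairs with the distance d = 1 - IOU and return k anchors."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)              # S21: random initial centers
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for box in boxes:                       # S22: distance to every center
            distances = [1.0 - iou_wh(box, c) for c in centers]
            clusters[distances.index(min(distances))].append(box)  # S23: assign
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:                         # S24: mean of the class members
                n = len(members)
                new_centers.append(
                    (sum(w for w, _ in members) / n, sum(h for _, h in members) / n)
                )
            else:                               # keep an empty class unchanged
                new_centers.append(old)
        shift = max(
            abs(nw - ow) + abs(nh - oh)
            for (nw, nh), (ow, oh) in zip(new_centers, centers)
        )
        centers = new_centers
        if shift < tol:                         # S25: stop when centers settle
            break
    return sorted(centers, key=lambda c: c[0] * c[1])  # smallest area first
```

Sorting by area makes it straightforward to hand two anchors to each output scale; which two go to which scale is not stated in the text and would be a further assumption.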
In this embodiment, the establishing a fast identification model of the power distribution tower outage information and designing a loss function in step S3 include:
defining the feature extraction network of the power distribution tower reverse-off information rapid identification model, which is lightweight: specifically, the feature extraction network comprises 7 convolution blocks, each comprising a convolution operation and a maximum pooling operation; convolution is used to extract the features of the distribution towers in the aerial image, and max pooling is used to reduce the amount of computation.
Defining a resolution feature network, whose output format has two feature scales with resolutions of 13 × 13 and 26 × 26; the resolution feature network comprises 3 × 3 and 1 × 1 convolution operations; the final output of the power distribution tower reverse-off information rapid identification model consists of two three-dimensional matrices of size 13 × 13 × 14 and 26 × 26 × 14, whose last dimension contains, for each of two power distribution tower target frames, the transformed center position $b_x$, $b_y$, the width and height $b_w$, $b_h$, the category, and the confidence of the target frame, 7 × 2 values in total; the model therefore outputs 13 × 13 × 2 + 26 × 26 × 2 target frames, 1690 in total; these data are finally converted into the two positioning frames of the object to be recognized, the target category in each frame, and the confidence of each frame; the output corresponding to each grid is set to two prediction frames; the conversion formulas are:
$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$
$$b_w = p_w e^{t_w}$$
$$b_h = p_h e^{t_h}$$
where $c_x$, $c_y$ are the numbers of grids by which the grid containing the center coordinate of the frame is offset from the first grid at the upper-left corner; $t_x$, $t_y$ are the predicted coordinates of the center point of the frame; $\sigma(\cdot)$ is the logistic function, which normalizes the coordinates to between 0 and 1; the resulting $b_x$, $b_y$ are values normalized relative to the grid; $t_w$, $t_h$ are the predicted width and height of the frame; and $p_w$, $p_h$ are the width and height of the anchor frame. The conversion is the inverse process of the above formulas: from $(t_x, t_y, t_w, t_h)$, the center coordinates and the width and height of the final target frame are obtained;
constructing the real label: the real label has the form n × n × 14, where n is 13 or 26, corresponding to the network output form; the input image is divided into 13 × 13 and 26 × 26 grids; for a given sample in the training set, if the center of the normalized coordinate position of a power distribution tower obtained in step S1 falls into a certain grid, the class probability of that tower in the output vector corresponding to that grid is set to 1, and the prediction probabilities of the other grids for that tower are set to 0;
establishing the loss function: by back-propagating the loss function, the network structure automatically learns the features of the normal towers and the inverted towers; the loss function is as follows:
[Equation image BDA0002284342940000111: loss function, not reproduced in this text]
where $\lambda_{coord}$, the weight of the position error, is set to 20 to strengthen the penalty on position loss; $\lambda_{noobj}$, the weight of the frame confidence error when no object exists in the output frame, is set to 1 to weaken the confidence loss penalty for frames without objects;
$\mathbb{1}_{ij}^{obj}$ indicates that the position error is added only when an object exists in the output frame; K represents the number of divided grids; M represents the number of output boxes. The confidence error of the frame uses a binary cross entropy loss function, and the classification error of the output frame is also added to the loss function.
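The coordinate conversion and the grid assignment used for the real labels can be sketched in Python as below. This is only an illustration under stated assumptions: the width and height are decoded with the exponential form commonly used in YOLO-style detectors, since the formula images for $b_w$ and $b_h$ are not reproduced in this text; the anchor sizes are assumed to be normalized to the image; and the function names `decode_box` and `grid_cell` are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t, cell, anchor, grid):
    """Turn raw predictions (tx, ty, tw, th) into an image-normalized frame.

    cell   -- (cx, cy): column and row offset of the grid from the top-left grid
    anchor -- (pw, ph): anchor width and height, assumed image-normalized
    grid   -- number of grids per side (13 or 26)
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = anchor
    bx = sigmoid(tx) + cx        # center x in grid units
    by = sigmoid(ty) + cy        # center y in grid units
    bw = pw * math.exp(tw)       # assumed exponential scaling of the anchor
    bh = ph * math.exp(th)
    return bx / grid, by / grid, bw, bh


def grid_cell(x, y, n):
    """Return the (column, row) of the grid that contains a normalized center."""
    col = min(int(x * n), n - 1)  # clamp so that x == 1.0 stays inside the grid
    row = min(int(y * n), n - 1)
    return col, row
```

For example, a tower whose normalized center is (0.5, 0.5) falls into grid (6, 6) at the 13 × 13 scale and grid (13, 13) at the 26 × 26 scale; only those grids would receive a class probability of 1 in the real label.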
As shown in fig. 4, in the present embodiment, the step of removing the repeated target frame by using the non-maximum suppression method in step S4 includes:
step SA: collecting all the obtained frames, 13 × 13 × 2 + 26 × 26 × 2 = 1690 in total, and sorting them according to the confidence score;
step SB: setting a confidence threshold of 0.6, and setting the score of every frame whose confidence is lower than the threshold to 0;
step SC: traversing all the frames, finding the prediction frame with the maximum score, and adding it to an output list;
step SD: traversing the remaining frames, and, according to a preset IOU (intersection ratio) threshold, setting to 0 the scores of all candidate frames whose intersection ratio with the frame in the output list is higher than the threshold;
step SE: judging whether any frame with a non-zero confidence score remains; if so, returning to step SC; otherwise, outputting the prediction frames in the list.
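Steps SA to SE can be sketched in Python as follows. The confidence threshold of 0.6 comes from step SB; the IOU threshold of 0.5 is an assumed value, since the text only says it is preset, and the function names are hypothetical. Frames are given as (x, y, w, h) with center coordinates.

```python
def iou_xywh(a, b):
    """IOU of two frames given as (center x, center y, width, height)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_thresh=0.6, iou_thresh=0.5):
    """Return the indices of the frames kept by non-maximum suppression."""
    # SB: frames below the confidence threshold get a score of 0
    scores = [s if s >= conf_thresh else 0.0 for s in scores]
    keep = []
    while True:
        # SC: pick the remaining frame with the highest score
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is None or scores[best] == 0.0:    # SE: nothing left to keep
            break
        keep.append(best)
        scores[best] = 0.0
        # SD: suppress frames that overlap the selected frame too strongly
        for i, s in enumerate(scores):
            if s > 0.0 and iou_xywh(boxes[i], boxes[best]) > iou_thresh:
                scores[i] = 0.0
    return keep
```

Taking the maximum on each pass replaces the explicit sort of step SA; sorting once up front and scanning in order would give the same result.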
In this embodiment, in the step S3, an online random enhancement mode is adopted for the image data input during the training period, including a mode of slightly scaling the image between 0.9 and 1.1, randomly cropping 10%, color enhancement (HSV), and the like, so as to enhance the generalization ability of the network to adapt to the lightweight condition of the model.
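The online random enhancement can be sketched in Python with the standard library only. The sketch makes several assumptions that the text does not settle: "randomly cropping 10%" is read as keeping a randomly placed window covering 90% of each side, the HSV enhancement is shown as a per-pixel jitter, and the function names and jitter parameters are hypothetical.

```python
import colorsys
import random


def sample_augmentation(width, height, rng):
    """Draw one random scale factor and one crop window for an image."""
    scale = rng.uniform(0.9, 1.1)                 # slight scaling in [0.9, 1.1]
    crop_w = int(round(width * 0.9))              # assumed: keep 90% of each side
    crop_h = int(round(height * 0.9))
    left = rng.randint(0, width - crop_w)         # random window position
    top = rng.randint(0, height - crop_h)
    return {"scale": scale, "crop": (left, top, left + crop_w, top + crop_h)}


def jitter_hsv(rgb, dh, ds, dv):
    """Shift the hue by dh and scale saturation/value by ds/dv for one RGB pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + dh) % 1.0
    s = min(1.0, max(0.0, s * ds))
    v = min(1.0, max(0.0, v * dv))
    return tuple(int(round(c * 255)) for c in colorsys.hsv_to_rgb(h, s, v))


rng = random.Random(0)
params = sample_augmentation(1000, 500, rng)
print(params["crop"], jitter_hsv((10, 20, 30), 0.02, 1.1, 0.9))
```

Because the parameters are redrawn for every training sample, the network rarely sees exactly the same image twice. Any crop or scale applied to the image also has to be applied to the labeling frames, which this sketch omits.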
Preferably, in the embodiment, the problem is regarded as a recognition and positioning problem, namely a target detection problem, and by designing a lightweight and easily-edited feature extraction network and designing a network optimization loss function, the network structure automatically learns the features of the normal tower and the reverse-disconnected tower, so that recognition and detection are realized at one time. The feature extraction network is light-weight, can be edited, and requires few calculation parameters, so that the real-time requirement on various devices can be met. During model training, image data input adopts an online random enhancement mode, including rotation, shearing, scaling, mapping, color enhancement and the like, so as to adapt to the condition of model lightweight.
In this embodiment, during the training in step S3, the input image data are enhanced online at random, including slight scaling between 0.9 and 1.1, random cropping by 10%, color enhancement (HSV), and so on, so as to enhance the generalization ability of the network and suit the lightweight model.
In this embodiment, the training set in step S3 comprises 12000 pictures of normal and abnormal towers; stochastic gradient descent is adopted with a batch size of 16, a momentum factor of 0.9, a weight decay coefficient of 0.005 and an initial learning rate of 0.001; training runs for 100 epochs, 75000 training batches in total, and the error tolerance is set to $10^{-1}$.
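The training figures above are mutually consistent, as the short Python check below shows. The dictionary keys are hypothetical names chosen for illustration; the values are those stated in this embodiment.

```python
# Hyperparameters as stated in the embodiment (key names are illustrative).
config = {
    "train_images": 12000,
    "batch_size": 16,
    "momentum": 0.9,
    "weight_decay": 0.005,
    "initial_learning_rate": 0.001,
    "epochs": 100,
    "error_tolerance": 1e-1,
}

# One pass over the training set takes train_images / batch_size batches.
iterations_per_epoch = config["train_images"] // config["batch_size"]
total_iterations = iterations_per_epoch * config["epochs"]

print(iterations_per_epoch, total_iterations)
```

With 12000 images and 16 images per batch, one pass takes 750 batches, so 100 passes give the 75000 batches quoted above.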
In this embodiment, the adopted feature extraction network may be tailored or modified according to hardware device conditions, so as to search out an optimal network structure.
In this embodiment, 4 anchor frames are determined by clustering with the K-Means algorithm, so as to fit the sizes of towers in unmanned aerial vehicle aerial images. The anchor frames are assigned to the 13 × 13 and 26 × 26 scales, two per scale. The output is divided into K × K grids (K is 13 or 26), and each grid predicts 2 frames relative to its 2 anchor frames.
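The grid layout above fixes the number of prediction frames the model emits. The short sketch below works that out, taking the figure of 7 values per frame (center x and y, width, height, confidence, and two class scores) from claim 3; the constant and function names are illustrative only.

```python
GRID_SIZES = (13, 26)   # K for the two output scales
BOXES_PER_CELL = 2      # each grid predicts 2 frames, one per anchor frame
VALUES_PER_BOX = 7      # x, y, w, h, confidence, and 2 class scores (claim 3)


def output_shape(k):
    """Shape of the output matrix for a K x K grid."""
    return (k, k, BOXES_PER_CELL * VALUES_PER_BOX)


# Total number of prediction frames over both scales.
total_boxes = sum(k * k * BOXES_PER_CELL for k in GRID_SIZES)
```

This gives outputs of 13 × 13 × 14 and 26 × 26 × 14 and 338 + 1352 = 1690 prediction frames, matching the totals quoted elsewhere in the document.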
It should be noted that most of these procedures are common in the image processing field but are necessary for network training. First, the high-definition aerial images collected by the unmanned aerial vehicle are processed and manually labeled; the labels include tower position information and tower type. The network model is judged by whether its accuracy can exceed that of a human, which is why the labeling step is currently mainly manual.
Design of the loss function: optimization proceeds in the direction of the negative gradient of the loss function, so that the designed network automatically learns the features of normal towers and reverse-off towers and identifies and locates them. The specific loss function is designed as follows:
Figure BDA0002284342940000131
The loss function includes a positioning error and a classification error, and is computed over the predicted center coordinates, the predicted bounding box, the predicted category, and the predicted confidence. λ_coord is the weight of the coordinate prediction error, and λ_noobj is the weight of the confidence prediction error.
The indicator shown in Figure BDA0002284342940000141 takes the value 0 or 1 and marks whether the j-th prediction frame in the i-th grid is responsible for the object. The prediction frame with the largest intersection over union with the real frame of the object to be detected is responsible for that object, that is, its indicator (Figure BDA0002284342940000142) is set to 1. The first part of the formula calculates the offset between each prediction frame and the actual calibration frame under each grid. The second part calculates the error between the width and height of the prediction frame and those of the actual calibration frame, for grids that contain an actual object, and balances it against the area of the prediction frame. The loss for the predicted category and the confidence of the frame with the largest intersection over union uses binary cross entropy, which reduces gradient explosion during training and improves convergence.
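The structure of the loss described above can be illustrated for a single prediction frame. The patent's formula survives here only as a figure, so the sketch below is an assumed YOLO-style form, not the patent's exact expression: squared error on center and size weighted by λ_coord, binary cross entropy on confidence and class, and λ_noobj applied to frames that are not responsible for any object. The area-balancing factor mentioned in the text is omitted, and the weights 20 and 1 are the values stated in claim 3.

```python
import math

LAMBDA_COORD = 20.0   # weight of the position error (value from claim 3)
LAMBDA_NOOBJ = 1.0    # weight of the confidence error when no object is present


def bce(p, y, eps=1e-7):
    """Binary cross entropy between a predicted probability p and a label y."""
    p = min(max(p, eps), 1.0 - eps)   # clamp to keep the logarithms finite
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def box_loss(pred, target, responsible):
    """Loss contribution of one prediction frame (assumed form, see lead-in).

    pred and target are dicts with keys x, y, w, h, conf, cls (two class scores).
    responsible is the 0/1 indicator: True when this frame has the largest
    intersection over union with a real frame.
    """
    if not responsible:
        # Only the confidence is penalised; the frame should report "no object".
        return LAMBDA_NOOBJ * bce(pred["conf"], 0.0)
    center = (pred["x"] - target["x"]) ** 2 + (pred["y"] - target["y"]) ** 2
    size = (pred["w"] - target["w"]) ** 2 + (pred["h"] - target["h"]) ** 2
    conf = bce(pred["conf"], 1.0)
    cls = sum(bce(p, t) for p, t in zip(pred["cls"], target["cls"]))
    return LAMBDA_COORD * (center + size) + conf + cls
```

Summing `box_loss` over every grid and every frame in each grid gives the total loss under these assumptions. The large λ_coord makes a position error cost far more than an equally sized confidence error.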
In addition, the feature extraction network is lightweight and comprises 7 convolution blocks, each composed of a convolution operation and a maximum pooling operation. The detection network uses resolutions of 13 × 13 and 26 × 26 and is composed of 3 × 3 and 1 × 1 convolution operations; it finally regresses directly to the positioning frame, confidence, and category of the object to be identified. The feature extraction network can be independently designed, cut, or modified according to the hardware conditions, so as to search for an optimal network structure. The method for identifying the reverse-off information of the power distribution tower based on unmanned aerial vehicle aerial images is therefore fast and lightweight, is suitable for mobile terminals or equipment terminals, and can meet real-time processing and judgment requirements.
To suit the lightweight network model, an online data enhancement step is specially designed, and the randomly enhanced data is used as the network input. It mainly includes rotation, cutting, scaling, mapping, and color enhancement, and improves the generalization and practical performance of the lightweight network, for example mean average precision (mAP) and intersection over union (IOU). In real-time detection, because the feature extraction backbone provided in this embodiment is lightweight, few parameters need to be learned, so video can be processed at 30 to 40 frames per second on an ordinary CPU device.
Specifically, one specific implementation of this embodiment is as follows:
First, the aerial images collected by the unmanned aerial vehicle are processed: 12000 reverse-off and normal power distribution towers are manually labeled, in a ratio of 1:1, with position and category information. Labeling generally uses the popular LabelImg software, which generates labeling information in XML format that is convenient for a scripting language to process in the next step. A script then divides the data into a training set and a testing set, reads the XML labeling information, converts the coordinates and categories, and writes them to txt files to form the labels. At this point, the sample and label files required for network training are ready.
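The conversion from XML labels to txt labels can be sketched as follows. The sketch assumes the Pascal VOC layout that LabelImg writes (`size/width`, `size/height`, and one `object` element per frame with `name` and `bndbox`); the class names in `CLASS_IDS`, the function names, and the one-row-per-frame txt layout are hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical label names; the actual names used during labeling are not given.
CLASS_IDS = {"normal": 0, "reverse_off": 1}

SAMPLE_XML = """
<annotation>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>normal</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>250</ymax></bndbox>
  </object>
</annotation>
"""


def voc_xml_to_rows(xml_text):
    """Return one (class_id, x, y, w, h) row per labeled tower, normalized to 0..1."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    rows = []
    for obj in root.iter("object"):
        class_id = CLASS_IDS[obj.find("name").text]
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        x = (xmin + xmax) / 2.0 / img_w   # normalized center x
        y = (ymin + ymax) / 2.0 / img_h   # normalized center y
        w = (xmax - xmin) / img_w         # normalized width
        h = (ymax - ymin) / img_h         # normalized height
        rows.append((class_id, x, y, w, h))
    return rows


def rows_to_txt(rows):
    """Format the rows as the text of one label file, one frame per line."""
    return "\n".join(
        f"{c} {x:.6f} {y:.6f} {w:.6f} {h:.6f}" for c, x, y, w, h in rows
    )
```

Calling `rows_to_txt(voc_xml_to_rows(SAMPLE_XML))` yields the label text for one image; writing it to a txt file named after the image gives the one-file-per-image layout described in claim 1.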
Cluster analysis is performed on the labeling frames of the power distribution tower images in the specific unmanned aerial vehicle aerial images, with the number of clusters set to 4, yielding 4 specific anchor frames. The prediction frames later output their offsets and sizes relative to these four anchor frames.
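A minimal sketch of this clustering, using the distance d(box, centroid) = 1 - IOU(box, centroid) given in claim 2, is shown below. It makes one simplifying assumption that is common for anchor clustering but is not stated in the text: only the widths and heights are compared, with the two frames aligned at a common corner. The function names, the tolerance, and the seeded initialization are illustrative.

```python
import random


def iou_wh(a, b):
    """IOU of two (width, height) frames aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=4, tol=1e-6, max_iter=100, seed=0):
    """Cluster (width, height) pairs into k anchor frames using 1 - IOU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)          # random initial cluster centers
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for box in boxes:                     # assign each frame to the nearest center
            dists = [1.0 - iou_wh(box, c) for c in centroids]
            clusters[dists.index(min(dists))].append(box)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:                       # new center is the mean of its members
                updated.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:                             # keep the center of an empty cluster
                updated.append(old)
        shift = max(abs(n[0] - o[0]) + abs(n[1] - o[1])
                    for n, o in zip(updated, centroids))
        centroids = updated
        if shift < tol:                       # centers stopped moving
            break
    return sorted(centroids, key=lambda c: c[0] * c[1])
```

Returning the anchor frames sorted by area makes it easy to give two of them to each output scale; which pair goes to which scale is not specified in this passage.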
A feature extraction network and a detection network are designed. A lightweight feature extraction network is built under the PyTorch framework; it comprises seven convolution blocks, each containing a convolution operation and a maximum pooling operation, and uses 3 × 3 and 1 × 1 convolution kernels to filter the image, removing irrelevant information in the aerial image and extracting useful information related to tower features. A loss function for optimization is designed under the PyTorch framework; it includes the error between the predicted rectangular frame position and the actual position, the error between the predicted category confidence and the actual category, and so on. The detection network uses feature maps at 13 × 13 and 26 × 26 resolution and generates outputs of size 13 × 13 × 14 and 26 × 26 × 14, whose last dimension contains the confidence, the center coordinates of the target frame, the width and height of the target frame, and the tower type.
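Turning the raw values in the last dimension into a frame can be sketched with the conversion stated in claim 3, bx=σ(tx)+cx and by=σ(ty)+cy. The width and height formulas appear in the claims only as figures, so the exponential form used below (anchor size multiplied by e raised to the predicted value) is an assumption borrowed from the usual YOLO formulation. The function names and the choice to express anchor sizes as fractions of the image are also assumptions.

```python
import math


def sigmoid(t):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(tx, ty, tw, th, cx, cy, pw, ph, grid):
    """Convert raw outputs for one grid into a frame normalized to the image.

    cx, cy: column and row index of the grid, counted from the top-left grid.
    pw, ph: anchor frame width and height as fractions of the image (assumed).
    grid:   number of grids per side, 13 or 26.
    """
    bx = sigmoid(tx) + cx        # center x in grid units (formula from claim 3)
    by = sigmoid(ty) + cy        # center y in grid units (formula from claim 3)
    bw = pw * math.exp(tw)       # assumed exponential form for the width
    bh = ph * math.exp(th)       # assumed exponential form for the height
    return bx / grid, by / grid, bw, bh
```

For example, raw outputs of zero in grid (6, 6) of the 13 × 13 scale place the center in the middle of that grid, which is the middle of the image, and leave the frame at exactly the anchor size.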
The training set contains 12000 pictures of normal and abnormal towers. Training uses stochastic gradient descent with a batch size of 16, a momentum factor of 0.9, a weight decay coefficient of 0.005, and an initial learning rate of 0.001. Training runs for 100 epochs, for a total of 75000 training batches, and the error tolerance is set to 10^-1. In the network training step, online enhancement of the training data set is specially designed; it mainly includes rotation, cutting, scaling, mapping, and color enhancement, and improves the generalization and practical performance of the lightweight network.
When the optimal weight is applied to the rapid identification model, 1690 prediction frames are obtained. Non-maximum suppression is applied to the frames that meet the category confidence threshold, keeping the prediction frame with the highest confidence, so that the position and type of each power distribution tower are obtained and the reverse-off information of the power distribution tower is rapidly identified.
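The suppression steps SA to SE can be sketched as below. The confidence threshold of 0.6 comes from the text; the IOU threshold of 0.5 is an assumed value, since the text only says "preset". Frames are given as (x_center, y_center, width, height), and the function names are illustrative.

```python
def iou_xywh(a, b):
    """IOU of two frames given as (x_center, y_center, width, height)."""
    ax1, ay1 = a[0] - a[2] / 2.0, a[1] - a[3] / 2.0
    ax2, ay2 = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx1, by1 = b[0] - b[2] / 2.0, b[1] - b[3] / 2.0
    bx2, by2 = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_thresh=0.6, iou_thresh=0.5):
    """Return the indices of the frames kept after non-maximum suppression."""
    # Step SB: zero the score of every frame below the confidence threshold.
    scores = [s if s >= conf_thresh else 0.0 for s in scores]
    keep = []
    while True:
        # Steps SA and SC: picking the highest remaining score each round is
        # equivalent to walking the frames in sorted order.
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        # Step SE: stop when no frame with a non-zero score remains.
        if best is None or scores[best] == 0.0:
            break
        keep.append(best)
        scores[best] = 0.0
        # Step SD: suppress frames that overlap the kept frame too much.
        for i, s in enumerate(scores):
            if s > 0.0 and iou_xywh(boxes[best], boxes[i]) > iou_thresh:
                scores[i] = 0.0
    return keep
```

The sketch suppresses frames regardless of category; whether suppression should run separately for normal and reverse-off towers is not stated in the text.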
The feature extraction network provided by this embodiment requires few training parameters, so the feed-forward inference of the network, that is, the identification of power distribution tower reverse-off information in aerial images, requires correspondingly less computation and can achieve real-time response. Meanwhile, the feature extraction network provided by this embodiment can easily be cut down or extended to suit different device performance requirements. The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (5)

1. A rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images, characterized by comprising the following steps:
step S1: manually labeling the power distribution tower information in unmanned aerial vehicle aerial images with LabelImg software, dividing the power distribution towers into a normal type and a reverse-off type, and generating XML format files; each XML format file contains the position information and category information of the towers; the XML format files are preprocessed with a Python script, which normalizes the tower position information by the length and width of the aerial image; the returned labels are the center point coordinates and the width and height (x, y, w, h) of each tower labeling frame and are stored in a txt file, one txt file per image; the data are then divided into a training set and a cross training set in a ratio of …:1;
step S2: for the power distribution tower image of the unmanned aerial vehicle aerial image in the step S1, clustering analysis is carried out on the marking frames by using a K-Means algorithm to determine 4 anchoring frames;
step S3: establishing a rapid identification model of power distribution tower reverse-off information, designing a loss function, and training by error back propagation until the error converges to the tolerable range of 10^-1, then storing the optimal weight;
step S4: and applying the optimal weight to a power distribution tower reverse-off information rapid identification model of the unmanned aerial vehicle aerial image, outputting a plurality of prediction frames by the model, removing repeated target frames by using a non-maximum suppression method, finally obtaining position information of normal and reverse-off power distribution towers, and completing rapid identification of reverse-off information.
2. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, wherein step S2 specifically comprises the following steps:
step S21: reading the coordinates of the central point, the width and the height data (x, y, w and h) of the marking frame of the power distribution tower in each sample from each txt file obtained in the step S1 by using a script, setting the clustering number k to be 4, and randomly initializing the clustering centers of four classes; the single txt file comprises (x, y, w, h) information of a plurality of power distribution tower marking frames;
step S22: calculating the distance from each sample point to each cluster center; the sample point is the power distribution tower marking frame information (x, y, w, h) contained in a single txt file, and the clustering center is the coordinate, width and height of the central point of a marking frame initialized randomly and is consistent with the data type of the sample point;
the distance function between the labeling box (x, y, w, h) and the cluster center box is set as:
d(box,centroid)=1-IOU(box,centroid)
wherein (x, y, w, h) is the central coordinate and length and width of the labeling frame, IOU is the intersection ratio between the labeling frames, and centroid is the central coordinate and length and width of the clustering central frame; the intersection ratio is the overlapping rate of the marking box1 and the clustering center box2, and is expressed by the formula:
Figure FDA0002284342930000021
step S23: dividing each sample into corresponding classes according to the distance;
step S24: computing the mean of the sample points in each class and updating the class center to that mean;
step S25: judging whether the difference between the current cluster centers and the previous cluster centers is smaller than a set limit; if not, returning to step S22; if so, the sizes of the 4 anchor frames are obtained by clustering and assigned to the 13 × 13 and 26 × 26 scales, two per scale.
3. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, wherein in step S3, establishing the rapid identification model of power distribution tower reverse-off information and designing the loss function specifically comprise:
defining a feature extraction network of the rapid identification model of power distribution tower reverse-off information, wherein the feature extraction network comprises 7 convolution blocks, and each convolution block comprises a convolution operation and a maximum pooling operation;
defining a resolution feature network with an output format of two feature scales, with resolutions of 13 × 13 and 26 × 26; the resolution feature network comprises 3 × 3 and 1 × 1 convolution operations; the final output of the rapid identification model of power distribution tower reverse-off information consists of two three-dimensional matrices of size 13 × 13 × 14 and 26 × 26 × 14, whose last dimension contains, for each of two power distribution tower target frames after transformation, the center position bx, by, the width and height bw, bh, the category, and the confidence of the target frame, 7 × 2 values in total; the model outputs 13 × 13 × 2 + 26 × 26 × 2 target frames, 1690 in total; these are finally converted into the two positioning frames of the object to be recognized, the target category in each frame, and the confidence of each frame; the output corresponding to each grid is set to two prediction frames; the conversion formula is:
bx=σ(tx)+cx
by=σ(ty)+cy
Figure FDA0002284342930000031
Figure FDA0002284342930000032
where cx, cy are the number of grids by which the grid containing the frame center is offset from the first grid at the upper left corner; tx, ty are the predicted coordinates of the frame center; σ() is the logistic function, which normalizes the coordinates to between 0 and 1; the resulting bx, by are normalized values relative to the grid; tw, th are the predicted width and height of the frame; pw, ph are the width and height of the anchor frame. The conversion is the inverse process of the above formulas: from (tx, ty, tw, th), the center coordinates and the width and height of the final target frame are obtained;
making the real label: the real label has the form n × n × 14, where n is 13 or 26, corresponding to the network output form; the input image is divided into 13 × 13 and 26 × 26 grids; for a given sample in the training set, when the center of the normalized coordinate position of a power distribution tower obtained in step S1 falls into a certain grid, the class probability of that tower in the output vector corresponding to that grid is set to 1, and the predicted probabilities of the other grids for that tower are set to 0;
establishing a loss function: by back-propagating the loss function, the network structure automatically learns the features of normal towers and reverse-off towers; the loss function is as follows:
Figure FDA0002284342930000041
wherein λ_coord is set to 20 and represents the weight of the position error, to strengthen the penalty on position loss; λ_noobj, the weight of the confidence prediction error, is set to 1 and represents the weight of the frame's confidence error when no object exists in the output frame, weakening the confidence loss penalty for frames without an object;
the indicator shown in Figure FDA0002284342930000042 means that the position error is added only when an object exists in the output frame; K represents the number of divided grids; M represents the number of output frames.
4. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, wherein removing the repeated target frames by the non-maximum suppression method in step S4 comprises the following steps:
step SA: collecting all obtained frames, 13 × 13 × 2 + 26 × 26 × 2 = 1690 in total, and sorting them by confidence score;
step SB: setting a confidence threshold of 0.6, and setting the score of every frame whose confidence is lower than the threshold to 0;
step SC: traversing all the frames, finding the prediction frame with the highest score, and adding it to an output list;
step SD: traversing the remaining frames and, according to a preset IOU threshold, setting to 0 the score of every candidate frame whose intersection over union with a frame in the output list is higher than the threshold;
step SE: determining whether any frame with a non-zero confidence score remains; if so, returning to step SC, and otherwise outputting the prediction frames in the list.
5. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, wherein in step S3, online random enhancement is applied to the image data input during training, including slight image scaling by a factor between 0.9 and 1.1, random cropping by 10%, and color enhancement, so as to improve the generalization ability of the network and suit the lightweight model.
CN201911154496.7A 2019-11-22 2019-11-22 Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image Pending CN110929646A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911154496.7A CN110929646A (en) 2019-11-22 2019-11-22 Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image

Publications (1)

Publication Number Publication Date
CN110929646A true CN110929646A (en) 2020-03-27

Family

ID=69851624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911154496.7A Pending CN110929646A (en) 2019-11-22 2019-11-22 Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image

Country Status (1)

Country Link
CN (1) CN110929646A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860245A (en) * 2020-04-01 2020-10-30 国网福建省电力有限公司 Inverted power distribution tower positioning method based on aerial tower image shot by unmanned aerial vehicle
CN112380944A (en) * 2020-11-06 2021-02-19 中国电力科学研究院有限公司 Method and system for evaluating structural state of transmission tower
CN112508076A (en) * 2020-12-02 2021-03-16 国网江西省电力有限公司建设分公司 Intelligent identification method and system for abnormal state of power engineering
CN112541455A (en) * 2020-12-21 2021-03-23 国网河南省电力公司电力科学研究院 Machine vision-based method for predicting accident of pole breakage of concrete pole of distribution network
CN113011405A (en) * 2021-05-25 2021-06-22 南京柠瑛智能科技有限公司 Method for solving multi-frame overlapping error of ground object target identification of unmanned aerial vehicle
CN114898221A (en) * 2022-07-14 2022-08-12 灵图数据(杭州)有限公司 Tower inclination detection method and device, electronic equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325418A (en) * 2018-08-23 2019-02-12 华南理工大学 Based on pedestrian recognition method under the road traffic environment for improving YOLOv3
CN109978035A (en) * 2019-03-18 2019-07-05 西安电子科技大学 Pedestrian detection method based on improved k-means and loss function
CN110059554A (en) * 2019-03-13 2019-07-26 重庆邮电大学 A kind of multiple branch circuit object detection method based on traffic scene
CN110245644A (en) * 2019-06-22 2019-09-17 福州大学 A kind of unmanned plane image transmission tower lodging knowledge method for distinguishing based on deep learning
AU2019101142A4 (en) * 2019-09-30 2019-10-31 Dong, Qirui MR A pedestrian detection method with lightweight backbone based on yolov3 network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
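郭敬东 (Guo Jingdong) et al.: "基于YOLO的无人机电力线路杆塔巡检图像实时检测" (Real-time detection of UAV power line tower inspection images based on YOLO), 《中国电力》 (Electric Power) *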

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860245A (en) * 2020-04-01 2020-10-30 国网福建省电力有限公司 Inverted power distribution tower positioning method based on aerial tower image shot by unmanned aerial vehicle
CN112380944A (en) * 2020-11-06 2021-02-19 中国电力科学研究院有限公司 Method and system for evaluating structural state of transmission tower
CN112380944B (en) * 2020-11-06 2021-12-21 中国电力科学研究院有限公司 Method and system for evaluating structural state of transmission tower based on satellite remote sensing
CN112508076A (en) * 2020-12-02 2021-03-16 国网江西省电力有限公司建设分公司 Intelligent identification method and system for abnormal state of power engineering
CN112541455A (en) * 2020-12-21 2021-03-23 国网河南省电力公司电力科学研究院 Machine vision-based method for predicting accident of pole breakage of concrete pole of distribution network
CN112541455B (en) * 2020-12-21 2023-07-07 国网河南省电力公司电力科学研究院 Machine vision-based prediction method for reverse breaking accidents of distribution network concrete electric pole
CN113011405A (en) * 2021-05-25 2021-06-22 南京柠瑛智能科技有限公司 Method for solving multi-frame overlapping error of ground object target identification of unmanned aerial vehicle
CN113011405B (en) * 2021-05-25 2021-08-13 南京柠瑛智能科技有限公司 Method for solving multi-frame overlapping error of ground object target identification of unmanned aerial vehicle
CN114898221A (en) * 2022-07-14 2022-08-12 灵图数据(杭州)有限公司 Tower inclination detection method and device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200327)