CN110929646A - Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image - Google Patents
- Publication number
- CN110929646A (CN201911154496.7A)
- Authority
- CN
- China
- Prior art keywords
- power distribution
- frames
- frame
- tower
- distribution tower
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/13—Satellite images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/06—Electricity, gas or water supply
Abstract
The invention relates to a method for rapidly identifying power distribution tower reverse-off information based on unmanned aerial vehicle aerial images, comprising the following steps: step S1: manually labeling the power distribution towers in the unmanned aerial vehicle aerial images, dividing them into a normal type and an inverted type, generating XML format files and preprocessing them; step S2: performing clustering analysis on the labeling frames to determine 4 anchoring frames; step S3: establishing a rapid identification model for power distribution tower reverse-off information and designing a loss function for error back-propagation training to obtain the optimal weights; step S4: applying the optimal weights to the rapid identification model to obtain the position information of the normal and inverted power distribution towers, completing rapid identification of the reverse-off information. The method is fast and lightweight, can process massive unmanned aerial vehicle aerial image data in real time, is suitable for mobile or device-side deployment, and promotes intelligent processing of unmanned aerial vehicle aerial images of power distribution towers.
Description
Technical Field
The invention relates to the field of image recognition, in particular to a rapid recognition method for power distribution tower reverse-off information based on aerial images of unmanned aerial vehicles.
Background
The power grid in southeast coastal areas is often affected by meteorological disasters such as typhoons, which cause power distribution towers to topple or break (reverse-off) and seriously threaten the safety of the distribution network. A damaged tower that is not handled in time can easily cause secondary accidents and threaten personal safety, so the maintenance and inspection of power distribution towers has become an important part of distribution network operation.
In recent years, unmanned aerial vehicle technology has developed rapidly and now supports automatic flight, active obstacle avoidance, path planning, high-definition image capture and similar functions. Unmanned aerial vehicles are therefore widely used for power inspection in China. Against this background, they have also been applied to aerial photography of power distribution towers as an auxiliary means of transmission line operation, maintenance and overhaul: the reverse-off information of the towers is mainly collected manually from the aerial images so that emergency repairs can be carried out promptly. However, each flight generates a large number of pictures; manual review cannot meet the timeliness requirement, and misjudgments and missed judgments occur easily in such a mechanical workflow, so intelligent processing of unmanned aerial vehicle aerial images is urgently needed.
Disclosure of Invention
In view of the above, the invention aims to provide a power distribution tower reverse-off information rapid identification method based on an unmanned aerial vehicle aerial image, which has the characteristics of rapidness and light weight, is used for processing massive unmanned aerial vehicle aerial image data in real time, and is suitable for a mobile terminal or an equipment terminal, so as to promote the intellectualization of future unmanned aerial vehicle aerial power distribution tower image processing.
The invention is realized by adopting the following scheme: a rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images comprises the following steps:
step S1: manually labeling the power distribution towers in the unmanned aerial vehicle aerial images by using LabelImg software, dividing the towers into a normal type and an inverted type, and generating XML format files; each XML format file comprises the position information and category information of the towers; the XML format files are preprocessed by a Python script, which normalizes the tower position information by the length and width of the aerial image and returns, as labels, the center-point coordinates, width and height (x, y, w, h) of each tower labeling frame; the labels are stored in txt files, one txt file per image; at the same time, the data are divided into a training set and a cross training set in a ratio of …:1;
step S2: for the power distribution tower image of the unmanned aerial vehicle aerial image in the step S1, clustering analysis is carried out on the marking frames by using a K-Means algorithm to determine 4 anchoring frames;
step S3: establishing a rapid identification model for power distribution tower reverse-off information and designing a loss function; training by error back propagation until the error converges to within the tolerable range of 10^-1, and storing the optimal weights;
step S4: applying the optimal weights to the rapid identification model for power distribution tower reverse-off information in unmanned aerial vehicle aerial images; the model outputs a plurality of prediction frames, repeated target frames are removed by non-maximum suppression (NMS), and the position information of the final normal and inverted power distribution towers is obtained, completing rapid identification of the reverse-off information.
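The preprocessing in step S1 can be sketched as follows. This is a minimal illustration, assuming LabelImg's default Pascal VOC XML layout; the function name `voc_xml_to_rows`, the class-name strings in `CLASS_IDS`, and the choice to put the class id in the first column of each txt line are assumptions made for illustration, not details given in this document.

```python
import xml.etree.ElementTree as ET

# Assumed class names and ids; the text only names a "normal" and an "inverted" type.
CLASS_IDS = {"normal": 0, "inverted": 1}


def voc_xml_to_rows(xml_text):
    """Return one (class_id, x, y, w, h) tuple per labeled tower, normalized to [0, 1]."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    rows = []
    for obj in root.findall("object"):
        class_id = CLASS_IDS[obj.find("name").text]
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        # Center point, width and height, each divided by the image size.
        x = (xmin + xmax) / 2.0 / img_w
        y = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        rows.append((class_id, x, y, w, h))
    return rows


# A hand-written sample annotation, not real data.
SAMPLE_XML = """<annotation>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object><name>inverted</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>250</ymax></bndbox>
  </object>
</annotation>"""

rows = voc_xml_to_rows(SAMPLE_XML)
# One text line per tower, ready to be written to the image's txt file.
txt_lines = [" ".join(str(v) for v in row) for row in rows]
```

Each image would get its own txt file containing `txt_lines`; splitting the files into training and cross training sets is a separate step and is not shown here.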
Further, the step S2 specifically includes the following steps:
step S21: reading the coordinates of the central point, the width and the height data (x, y, w and h) of the marking frame of the power distribution tower in each sample from each txt file obtained in the step S1 by using a script, setting the clustering number k to be 4, and randomly initializing the clustering centers of four classes; the single txt file comprises (x, y, w, h) information of a plurality of power distribution tower marking frames;
step S22: calculating the distance from each sample point to each cluster center; the sample point is the power distribution tower marking frame information (x, y, w, h) contained in a single txt file, and the clustering center is the coordinate, width and height of the central point of a marking frame initialized randomly and is consistent with the data type of the sample point;
the distance function between the labeling box (x, y, w, h) and the cluster center box is set as:
d(box,centroid)=1-IOU(box,centroid)
wherein (x, y, w, h) are the center coordinates, width and height of the labeling frame (box); centroid denotes the center coordinates, width and height of the cluster center frame; and IOU is the intersection-over-union of the two frames, that is, the overlap rate of the labeling frame box1 and the cluster center frame box2, expressed by the formula:
IOU(box1,box2)=area(box1∩box2)/area(box1∪box2)
step S23: dividing each sample into corresponding classes according to the distance;
step S24: calculating the sum of the sample points of each class, averaging the sum of the sample points, and updating the center of the class;
step S25: judging whether the difference between the current cluster centers and the previous cluster centers is smaller than a set limit; if not, returning to step S22; if so, the clustering yields the sizes of the 4 anchor frames, two for the 13 × 13 scale and two for the 26 × 26 scale.
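Steps S21 to S25 can be sketched as a small K-Means loop that uses d = 1 - IOU as the distance. This is only an illustration under stated assumptions: it clusters on width and height alone, treating every frame as sharing the same center (a common simplification for anchor clustering, not something this document specifies), and the function names, the `tol` limit, the iteration cap and the sample frames are all hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (w, h) frames that are assumed to share the same center."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=4, max_iter=100, tol=1e-6, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # S21: random initialization
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for box in boxes:  # S22 and S23: assign each frame to the nearest center
            dists = [1.0 - iou_wh(box, c) for c in centroids]
            clusters[dists.index(min(dists))].append(box)
        new_centroids = []
        for old, members in zip(centroids, clusters):  # S24: mean of each class
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(old)  # keep the center of an empty class
        shift = max(abs(a[0] - b[0]) + abs(a[1] - b[1])
                    for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:  # S25: stop once the centers no longer move
            break
    # Sort by area so the smaller anchors come first.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Made-up normalized (w, h) pairs standing in for labeled tower frames.
boxes = [(0.05, 0.10), (0.06, 0.11), (0.10, 0.20), (0.11, 0.21),
         (0.20, 0.40), (0.21, 0.41), (0.40, 0.60), (0.41, 0.61)]
anchors = kmeans_anchors(boxes, k=4)
```

How the four resulting anchors are split between the 13 × 13 and 26 × 26 scales is not shown; assigning the two larger ones to the coarser grid would be one plausible choice.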
Further, in step S3, establishing a fast identification model of the power distribution tower outage information and designing a loss function, specifically, the contents are as follows:
defining the feature extraction network of the rapid identification model for power distribution tower reverse-off information: the feature extraction network comprises 7 convolution blocks, each comprising a convolution operation and a maximum pooling operation;
defining a resolution feature network, whose output format has two feature scales with resolutions of 13 × 13 and 26 × 26; the resolution feature network comprises 3 × 3 and 1 × 1 convolution operations; the final output of the rapid identification model consists of three-dimensional matrices of 13 × 13 × 14 and 26 × 26 × 14, whose last dimension holds, for two transformed power distribution tower target frames, the center position b_x, b_y, the width and height b_w, b_h, the category, and the confidence of the target frame, 7 × 2 values in total; the model thus outputs 13 × 13 × 2 + 26 × 26 × 2 = 1690 target frames in total; these are finally converted into two positioning frames of the object to be recognized, the target categories in the frames and the confidence of the frames; the output corresponding to each grid is set to two prediction frames; the conversion formula is:
b_x=σ(t_x)+c_x
b_y=σ(t_y)+c_y
where c_x, c_y are the number of grids by which the grid containing the center coordinates of the frame is offset from the first grid at the upper-left corner; t_x, t_y are the predicted coordinates of the center point of the frame; σ() is the logistic function, which normalizes the coordinates to between 0 and 1; the resulting b_x, b_y are normalized values relative to the grid; t_w, t_h are the predicted width and height of the frame; and p_w, p_h are the width and height of the anchor frame. The conversion applies the inverse process of the above formula to (t_x, t_y, t_w, t_h) to obtain the center coordinates, width and height of the final target frame;
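The conversion can be illustrated with a short decoding sketch. The two center formulas are the ones given above; the width and height decoding (anchor size multiplied by an exponential) follows the usual YOLO convention and is an assumption here, since this text gives no formula for b_w and b_h. The function names, the `grid_size` parameter and the use of normalized anchor sizes are likewise hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t, cell, anchor, grid_size):
    """Turn raw outputs (t_x, t_y, t_w, t_h) into a frame normalized to the image."""
    t_x, t_y, t_w, t_h = t
    c_x, c_y = cell      # column and row offset of the grid from the upper-left grid
    p_w, p_h = anchor    # anchor width and height, assumed normalized to the image
    b_x = sigmoid(t_x) + c_x     # center formulas from the text, in grid units
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)    # assumption: YOLO-style size decoding
    b_h = p_h * math.exp(t_h)
    # Divide the center by the grid size to express it relative to the whole image.
    return b_x / grid_size, b_y / grid_size, b_w, b_h


# Zero raw outputs in the middle grid of a 13 x 13 map give a centered frame
# whose size equals the anchor size.
decoded = decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), anchor=(0.2, 0.4), grid_size=13)
```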
manufacturing the real label: the real label has the form n × n × 14, where n is 13 or 26, matching the network output form; the input image is divided into 13 × 13 and 26 × 26 grids; for a given sample in the training set, if the center of the normalized coordinate position of a power distribution tower obtained in step S1 falls into a certain grid, the class probability of that tower in the output vector of that grid is set to 1, and the prediction probabilities of all other grids for that tower are set to 0;
establishing a loss function: through the back propagation loss function, the network structure automatically learns the characteristics of the normal tower and the reverse broken tower; the loss function is as follows:
wherein λ_coord is set to 20 and represents the weight of the position error, strengthening the penalty on position loss; λ_noobj is set to 1 and represents the weight of the frame confidence error when no object exists in the output frame, weakening the confidence loss penalty for frames without objects; the object indicator term adds the position error only when an object exists in the output frame; K represents the number of divided grids; M represents the number of output frames.
Further, the step of removing the repeated target frame by using the non-maximum suppression method in step S4 is:
step SA: collecting all the obtained frames, 13 × 13 × 2 + 26 × 26 × 2 = 1690 in total, and sorting them by confidence score;
step SB: setting a confidence threshold of 0.6, and setting to 0 the score of every frame whose confidence is lower than the threshold;
step SC: traversing all the frames, finding the prediction frame with the maximum score, and adding it to the output list;
step SD: traversing the remaining frames and, according to a preset IOU (intersection-over-union) threshold, setting to 0 the scores of all candidate frames whose IOU with a frame in the output list is higher than the threshold;
step SE: judging whether any frames with a non-zero confidence score remain; if so, returning to step SC; otherwise, outputting the prediction frames in the list.
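Steps SA to SE can be sketched as the loop below. It is a minimal illustration, not the patented implementation: the confidence threshold of 0.6 comes from the text, while the IOU threshold of 0.5, the function names and the sample frames are assumptions. Picking the maximum-score frame on each pass makes an explicit sort unnecessary.

```python
def iou_xywh(a, b):
    """IoU of two frames given as (x_center, y_center, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_threshold=0.6, iou_threshold=0.5):
    """Return the indices of the frames kept after non-maximum suppression."""
    # Step SB: zero out every score below the confidence threshold (on a copy).
    scores = [s if s >= conf_threshold else 0.0 for s in scores]
    keep = []
    while True:
        # Step SC: find the frame with the maximum remaining score.
        best = max(range(len(scores)), key=lambda i: scores[i], default=None)
        if best is None or scores[best] == 0.0:
            break  # Step SE: no frame with a non-zero score is left
        keep.append(best)
        scores[best] = 0.0
        # Step SD: suppress frames that overlap the newly kept frame too much.
        for i, s in enumerate(scores):
            if s > 0.0 and iou_xywh(boxes[i], boxes[best]) > iou_threshold:
                scores[i] = 0.0
    return keep


# Total number of candidate frames produced by the two output scales.
total_frames = 13 * 13 * 2 + 26 * 26 * 2

# Made-up frames: the second overlaps the first, the fourth has low confidence.
sample_boxes = [(0.50, 0.5, 0.2, 0.2), (0.51, 0.5, 0.2, 0.2),
                (0.20, 0.2, 0.1, 0.1), (0.80, 0.8, 0.1, 0.1)]
sample_scores = [0.9, 0.8, 0.7, 0.3]
kept = nms(sample_boxes, sample_scores)
```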
Further, in step S3, online random enhancement is applied to the image data input during training, including slight scaling of the image by a factor between 0.9 and 1.1, random cropping of 10%, and color enhancement, so as to strengthen the generalization ability of the network and compensate for the lightweight model.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention carries out K-Means clustering analysis on the marking frame, thereby obtaining a better anchoring frame.
(2) Compared with the traditional feature extraction network, the feature extraction network and the resolution network used in the invention are pruned and improved, and the required calculation parameters are less, so that the network has the advantages of light weight and low requirement on equipment performance.
(3) The loss function mode designed by the invention can effectively reduce errors in positioning and classification, and the sources of the errors are considered more comprehensively, so that the model can achieve higher accuracy.
(4) The invention designs a method for enhancing data on line, performs random multi-scale training, enhances the robustness of the network and reduces the side effect caused by light weight of the network.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a diagram of a model network structure for rapidly identifying power distribution tower reverse-off information based on an unmanned aerial vehicle aerial image according to an embodiment of the invention.
FIG. 3 is a flowchart of a K-Means clustering method according to an embodiment of the present invention.
Fig. 4 is a flowchart of a non-maximum suppression method according to an embodiment of the invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1 and 2, this embodiment provides a method for rapidly identifying power distribution tower reverse-off information based on unmanned aerial vehicle aerial images. The method comprises two modules, a network training module and a network detection module: when an identification task is being executed, the method switches to the network detection module; when no task is being executed, it switches to the network training module to learn the tower features of newly added data, so as to adapt to more complex backgrounds around toppled poles.
The method comprises the following steps:
step S1: manually labeling the power distribution towers in the unmanned aerial vehicle aerial images by using LabelImg software, dividing the towers into a normal type and an inverted type, and generating XML format files; each XML format file comprises the position information and category information of the towers; the XML format files are preprocessed by a Python script, which normalizes the tower position information by the length and width of the aerial image and returns, as labels, the center-point coordinates, width and height (x, y, w, h) of each tower labeling frame; the labels are stored in txt files, one txt file per image; at the same time, the data are divided into a training set and a cross training set in a ratio of …:1; LabelImg is a visual image annotation tool that generates XML files of the labeling information for subsequent processing by the Python script.
Step S2: for the power distribution tower image of the unmanned aerial vehicle aerial image in the step S1, clustering analysis is carried out on the marking frames by using a K-Means algorithm to determine 4 anchoring frames;
step S3: establishing a rapid identification model for power distribution tower reverse-off information and designing a loss function; training by error back propagation until the error converges to within the tolerable range of 10^-1, and storing the optimal weights;
step S4: applying the optimal weights to the rapid identification model for power distribution tower reverse-off information in unmanned aerial vehicle aerial images; the model outputs a plurality of prediction frames, repeated target frames are removed by non-maximum suppression (NMS), and the position information of the final normal and inverted power distribution towers is obtained, completing rapid identification of the reverse-off information.
As shown in fig. 3, in this embodiment, the step S2 specifically includes the following steps:
step S21: reading the coordinates of the central point, the width and the height data (x, y, w and h) of the marking frame of the power distribution tower in each sample from each txt file obtained in the step S1 by using a script, setting the clustering number k to be 4, and randomly initializing the clustering centers of four classes; the single txt file comprises (x, y, w, h) information of a plurality of power distribution tower marking frames;
step S22: calculating the distance from each sample point to each cluster center; the sample point is the power distribution tower marking frame information (x, y, w, h) contained in a single txt file, and the clustering center is the coordinate, width and height of the central point of a marking frame initialized randomly and is consistent with the data type of the sample point;
the distance function between the labeling box (x, y, w, h) and the cluster center box is set as:
d(box,centroid)=1-IOU(box,centroid)
wherein (x, y, w, h) are the center coordinates, width and height of the labeling frame (box); centroid denotes the center coordinates, width and height of the cluster center frame; and IOU is the intersection-over-union of the two frames, that is, the overlap rate of the labeling frame box1 and the cluster center frame box2, expressed by the formula:
IOU(box1,box2)=area(box1∩box2)/area(box1∪box2)
step S23: dividing each sample into corresponding classes according to the distance;
step S24: calculating the sum of the sample points of each class, averaging the sum of the sample points, and updating the center of the class;
step S25: judging whether the difference between the current cluster centers and the previous cluster centers is smaller than a set limit; if not, returning to step S22; if so, the clustering yields the sizes of the 4 anchor frames, two for the 13 × 13 scale and two for the 26 × 26 scale.
In this embodiment, the establishing a fast identification model of the power distribution tower outage information and designing a loss function in step S3 include:
defining the feature extraction network of the rapid identification model for power distribution tower reverse-off information, which is lightweight: specifically, the feature extraction network comprises 7 convolution blocks, each comprising a convolution operation and a maximum pooling operation; the convolution extracts the features of the power distribution towers in the aerial image, and the maximum pooling reduces the amount of computation.
Defining a resolution feature network, whose output format has two feature scales with resolutions of 13 × 13 and 26 × 26; the resolution feature network comprises 3 × 3 and 1 × 1 convolution operations; the final output of the rapid identification model consists of three-dimensional matrices of 13 × 13 × 14 and 26 × 26 × 14, whose last dimension holds, for two transformed power distribution tower target frames, the center position b_x, b_y, the width and height b_w, b_h, the category, and the confidence of the target frame, 7 × 2 values in total; the model thus outputs 13 × 13 × 2 + 26 × 26 × 2 = 1690 target frames in total; these are finally converted into two positioning frames of the object to be recognized, the target categories in the frames and the confidence of the frames; the output corresponding to each grid is set to two prediction frames; the conversion formula is:
b_x=σ(t_x)+c_x
b_y=σ(t_y)+c_y
where c_x, c_y are the number of grids by which the grid containing the center coordinates of the frame is offset from the first grid at the upper-left corner; t_x, t_y are the predicted coordinates of the center point of the frame; σ() is the logistic function, which normalizes the coordinates to between 0 and 1; the resulting b_x, b_y are normalized values relative to the grid; t_w, t_h are the predicted width and height of the frame; and p_w, p_h are the width and height of the anchor frame. The conversion applies the inverse process of the above formula to (t_x, t_y, t_w, t_h) to obtain the center coordinates, width and height of the final target frame;
manufacturing the real label: the real label has the form n × n × 14, where n is 13 or 26, matching the network output form; the input image is divided into 13 × 13 and 26 × 26 grids; for a given sample in the training set, if the center of the normalized coordinate position of a power distribution tower obtained in step S1 falls into a certain grid, the class probability of that tower in the output vector of that grid is set to 1, and the prediction probabilities of all other grids for that tower are set to 0;
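Building the real label can be sketched as below. The layout of the 14 values is an assumption: two slots of 7 values each, ordered (x, y, w, h, confidence, p_normal, p_inverted), with both slots filled for the grid that contains the tower center. The text fixes only the overall n × n × 14 shape and the rule that the class probability is 1 in the grid containing the center and 0 elsewhere; the function name `make_label` is hypothetical.

```python
def make_label(towers, n):
    """Build an n x n x 14 label (nested lists) from normalized tower frames.

    towers: list of (class_id, x, y, w, h) with class_id 0 (normal) or 1 (inverted).
    """
    label = [[[0.0] * 14 for _ in range(n)] for _ in range(n)]
    for class_id, x, y, w, h in towers:
        # Grid that contains the frame center; clamp so x == 1.0 stays in range.
        col = min(int(x * n), n - 1)
        row = min(int(y * n), n - 1)
        for slot in range(2):          # assumed: fill both prediction slots
            base = slot * 7
            label[row][col][base:base + 5] = [x, y, w, h, 1.0]
            label[row][col][base + 5 + class_id] = 1.0
    return label


# One inverted tower centered in the image, labeled on the 13 x 13 scale.
label_13 = make_label([(1, 0.5, 0.5, 0.2, 0.4)], 13)
```

The same call with `n=26` would give the label for the finer scale.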
establishing a loss function: through the back propagation loss function, the network structure automatically learns the characteristics of the normal tower and the reverse broken tower; the loss function is as follows:
wherein λ_coord is set to 20 and represents the weight of the position error, strengthening the penalty on position loss; λ_noobj is set to 1 and represents the weight of the frame confidence error when no object exists in the output frame, weakening the confidence loss penalty for frames without objects; the object indicator term adds the position error only when an object exists in the output frame; K represents the number of divided grids; M represents the number of output frames. The confidence error of the frame uses a binary cross-entropy loss function, and the classification error of the output frame is also added to the loss function.
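The loss formula itself does not appear in this text. One form consistent with the symbols described above, offered only as an assumption modeled on YOLO-style losses, is sketched here. The squared-error position term, the binary cross-entropy class term, the indicator notation and the hatted symbols for predictions are all illustrative choices; only λ_coord = 20, λ_noobj = 1, the sums over K grids and M output frames, and the binary cross-entropy confidence term are taken from the description.

```latex
\begin{aligned}
L ={}& \lambda_{coord} \sum_{i=1}^{K}\sum_{j=1}^{M} \mathbb{1}_{ij}^{obj}
       \left[(x_{ij}-\hat{x}_{ij})^2 + (y_{ij}-\hat{y}_{ij})^2
           + (w_{ij}-\hat{w}_{ij})^2 + (h_{ij}-\hat{h}_{ij})^2\right] \\
     &+ \sum_{i=1}^{K}\sum_{j=1}^{M} \mathbb{1}_{ij}^{obj}\,
       \mathrm{BCE}\!\left(C_{ij}, \hat{C}_{ij}\right)
      + \lambda_{noobj} \sum_{i=1}^{K}\sum_{j=1}^{M} \mathbb{1}_{ij}^{noobj}\,
       \mathrm{BCE}\!\left(C_{ij}, \hat{C}_{ij}\right) \\
     &+ \sum_{i=1}^{K}\sum_{j=1}^{M} \mathbb{1}_{ij}^{obj}
       \sum_{c \in \{\mathrm{normal},\,\mathrm{inverted}\}}
       \mathrm{BCE}\!\left(p_{ij}(c), \hat{p}_{ij}(c)\right), \\
\mathrm{BCE}(a,\hat{a}) ={}& -a\log\hat{a} - (1-a)\log(1-\hat{a})
\end{aligned}
```

With λ_coord = 20 and λ_noobj = 1, position errors in frames that contain a tower are weighted twenty times more heavily than confidence errors in empty frames, which matches the stated intent of strengthening the position penalty and weakening the no-object penalty.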
As shown in fig. 4, in the present embodiment, the step of removing the repeated target frame by using the non-maximum suppression method in step S4 includes:
step SA: collecting all the obtained frames, 13 × 13 × 2 + 26 × 26 × 2 = 1690 in total, and sorting them by confidence score;
step SB: setting a confidence threshold of 0.6, and setting to 0 the score of every frame whose confidence is lower than the threshold;
step SC: traversing all the frames, finding the prediction frame with the maximum score, and adding it to the output list;
step SD: traversing the remaining frames and, according to a preset IOU (intersection-over-union) threshold, setting to 0 the scores of all candidate frames whose IOU with a frame in the output list is higher than the threshold;
step SE: judging whether any frames with a non-zero confidence score remain; if so, returning to step SC; otherwise, outputting the prediction frames in the list.
In this embodiment, in step S3, online random enhancement is applied to the image data input during training, including slight scaling of the image by a factor between 0.9 and 1.1, random cropping of 10%, color enhancement (HSV) and the like, so as to strengthen the generalization ability of the network and compensate for the lightweight model.
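The online enhancement can be sketched with the standard library alone. This is an illustrative reading, not the embodiment's code: the image is a nested list of (r, g, b) floats in [0, 1], scaling uses nearest-neighbor resampling, "cropping 10%" is interpreted as keeping 90% of each side, and the HSV jitter ranges are invented for the example. The function name `augment` is hypothetical.

```python
import colorsys
import random


def augment(image, rng):
    """Randomly scale, crop and color-jitter an image given as rows of (r, g, b) floats."""
    h, w = len(image), len(image[0])

    # 1. Slight scaling by a factor in [0.9, 1.1], nearest-neighbor resampling.
    s = rng.uniform(0.9, 1.1)
    new_h, new_w = max(1, round(h * s)), max(1, round(w * s))
    scaled = [[image[min(h - 1, int(r / s))][min(w - 1, int(c / s))]
               for c in range(new_w)] for r in range(new_h)]

    # 2. Random crop keeping 90% of each side (assumed meaning of "10%").
    crop_h, crop_w = max(1, int(new_h * 0.9)), max(1, int(new_w * 0.9))
    top = rng.randint(0, new_h - crop_h)
    left = rng.randint(0, new_w - crop_w)
    cropped = [row[left:left + crop_w] for row in scaled[top:top + crop_h]]

    # 3. HSV color enhancement with assumed jitter ranges.
    dh = rng.uniform(-0.05, 0.05)
    ds = rng.uniform(0.8, 1.2)
    dv = rng.uniform(0.8, 1.2)
    out = []
    for row in cropped:
        new_row = []
        for r, g, b in row:
            hue, sat, val = colorsys.rgb_to_hsv(r, g, b)
            new_row.append(colorsys.hsv_to_rgb((hue + dh) % 1.0,
                                               min(1.0, sat * ds),
                                               min(1.0, val * dv)))
        out.append(new_row)
    return out


# A uniform 20 x 30 test image, not real data.
test_image = [[(0.5, 0.4, 0.3) for _ in range(30)] for _ in range(20)]
augmented = augment(test_image, random.Random(0))
```

In real training the labeling frames would have to be scaled and shifted with the same parameters so that they still match the towers; that bookkeeping is omitted here.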
Preferably, in the embodiment, the problem is regarded as a recognition and positioning problem, namely a target detection problem, and by designing a lightweight and easily-edited feature extraction network and designing a network optimization loss function, the network structure automatically learns the features of the normal tower and the reverse-disconnected tower, so that recognition and detection are realized at one time. The feature extraction network is light-weight, can be edited, and requires few calculation parameters, so that the real-time requirement on various devices can be met. During model training, image data input adopts an online random enhancement mode, including rotation, shearing, scaling, mapping, color enhancement and the like, so as to adapt to the condition of model lightweight.
In this embodiment, the training set in step S3 contains 12000 pictures of normal and abnormal towers. Stochastic gradient descent is used, with a batch size of 16, a momentum factor of 0.9, a weight decay coefficient of 0.005 and an initial learning rate of 0.001. Training runs for 100 epochs, for a total of 75000 training batches, and the error tolerance is set to 10^-1.
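As a quick consistency check, the stated hyperparameters can be gathered in one place and the total batch count recomputed from them. The dictionary keys below are hypothetical names chosen for illustration; only the values come from the text.

```python
import math

# Values taken from the embodiment; key names are illustrative only.
train_config = {
    "num_images": 12000,
    "batch_size": 16,
    "momentum": 0.9,
    "weight_decay": 0.005,
    "initial_lr": 0.001,
    "epochs": 100,
    "error_tolerance": 1e-1,
}

# Batches needed to see every image once.
iterations_per_epoch = math.ceil(
    train_config["num_images"] / train_config["batch_size"]
)
# Total batches over the whole training run.
total_iterations = iterations_per_epoch * train_config["epochs"]
```

With 12000 images and a batch size of 16, each epoch takes 750 batches, so 100 epochs give the 75000 batches quoted above.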
In this embodiment, the adopted feature extraction network may be tailored or modified according to hardware device conditions, so as to search out an optimal network structure.
In this embodiment, 4 anchor frames are determined by K-Means clustering so as to fit the size of towers in the unmanned aerial vehicle aerial images; they are assigned to the 13 × 13 and 26 × 26 scales, two per scale. The output is divided into K × K grids (K is 13 or 26), and each grid predicts 2 frames relative to its 2 anchor frames.
It should be noted that most of these procedures are common in the image processing field but are necessary for network training. First, the high-definition aerial images collected by the unmanned aerial vehicle are processed and manually labeled; the labels comprise the tower position information and the tower type. The quality of the network model is measured by whether its judgment accuracy can exceed that of a human, so the labeling step is currently performed mainly by hand.
Design of the loss function: optimization proceeds along the negative gradient of the loss function, so that the network automatically learns the features of normal towers and reverse-off towers and identifies and positions them. The loss function is designed as follows:
The loss function includes a positioning error and a classification error, and is computed over the predicted center coordinates, the predicted bounding box, the predicted category and the predicted confidence. λcoord is the weight of the coordinate prediction error, and λnoobj is the weight of the confidence prediction error. An indicator taking the value 0 or 1 judges whether the j-th prediction frame in the i-th grid is responsible for the object: the frame with the largest intersection-over-union with the real frame of the object to be detected is responsible for that object, and its indicator is set to 1. The first part of the formula calculates the offset between each prediction frame and the actual calibration frame under each grid; the second part calculates the error between the width and height of the prediction frame and those of the actual calibration frame for grids that contain an object, balanced by the area of the prediction frame. The loss of the predicted category and of the confidence is calculated for the frame with the largest intersection-over-union; these terms use binary cross entropy, which reduces gradient explosion during training and improves convergence.
In addition, the feature extraction network is lightweight and comprises 7 convolutional blocks, each consisting of a convolution operation and a maximum pooling operation. The detection network works at resolutions of 13 × 13 and 26 × 26 and consists of 3 × 3 and 1 × 1 convolutions; finally, direct regression produces the positioning frame, confidence and category of the object to be identified. The feature extraction network can be independently designed, cut or modified according to the hardware conditions, so as to search for an optimal network structure. The method for identifying power distribution tower reverse-off information based on unmanned aerial vehicle aerial images is therefore fast and lightweight, is suitable for mobile terminals or equipment terminals, and can meet real-time processing and judgment requirements.
To compensate for the lightweight network model, an online data enhancement step is specially designed and applied at random to the network input. It mainly includes rotation, cropping, scaling, mapping, color enhancement and the like, which improves the generalization and practical performance of the lightweight network, as measured for example by mean average precision (mAP) and intersection-over-union (IOU). In real-time detection, because the feature extraction backbone provided by this embodiment is lightweight, few parameters need to be learned, and the video processing speed can therefore reach 30-40 frames per second on an ordinary CPU device.
Specifically, one implementation of this embodiment is as follows:
First, the aerial images collected by the unmanned aerial vehicle are processed: 12000 reverse-off and normal power distribution towers, in a ratio of 1:1, are manually labeled with position and category information. The widely used LabelImg software is generally used for labeling; it generates the labeling information in XML format, which is convenient for processing with a scripting language in the next step. A script then divides the data into a training set and a test set, reads the XML labeling information, converts the coordinates and categories, and writes them to text files in txt format to form the labels. At this point, the sample and label files required for network training are ready.
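The XML-to-txt conversion can be sketched as below, assuming LabelImg's Pascal VOC output (a `size` element plus one `object` element per frame with `name` and `bndbox` children). The class names `normal` and `fallen`, the function name, and the "class x y w h" line layout are hypothetical choices for illustration; the text does not specify them.

```python
import xml.etree.ElementTree as ET

# Hypothetical label names; the real names depend on how the images were annotated.
CLASS_IDS = {"normal": 0, "fallen": 1}


def voc_xml_to_label_lines(xml_text, class_ids=CLASS_IDS):
    """Convert one LabelImg (Pascal VOC) annotation into normalized label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        cls = class_ids[obj.find("name").text]
        bb = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bb.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Center point, width and height, normalized by the image size.
        x = (xmin + xmax) / 2 / img_w
        y = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{cls} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines
```

Writing the returned lines to a txt file named after the image yields one label file per image.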
For the power distribution tower images in the unmanned aerial vehicle aerial images, cluster analysis is performed on the labeling frames with the number of clusters set to 4, yielding 4 anchor frames; the prediction frames later output their offsets and sizes relative to these four anchor frames.
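A minimal sketch of this clustering, using the distance d = 1 - IOU, is given below. It makes one simplifying assumption that is common in anchor clustering but is not stated in the text: only the width and height of each labeling frame are used, and frames are compared as if they shared the same corner. The function names, the stopping tolerance and the random seed are also illustrative.

```python
import random


def wh_iou(box, centroid):
    """IOU of two (w, h) frames aligned at the same corner (an assumed simplification)."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=4, max_iter=100, tol=1e-6, seed=0):
    """Cluster (w, h) pairs with distance 1 - IOU; return k anchors sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # random initial cluster centers
    for _ in range(max_iter):
        # Assign every sample to the nearest cluster center.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            nearest = min(range(k), key=lambda c: 1.0 - wh_iou(box, centroids[c]))
            clusters[nearest].append(box)
        # Update each center to the mean of its members (keep it if empty).
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append((
                    sum(b[0] for b in members) / len(members),
                    sum(b[1] for b in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[c])
        # Stop once the centers move less than the set limit.
        shift = max(
            abs(n[0] - o[0]) + abs(n[1] - o[1])
            for n, o in zip(new_centroids, centroids)
        )
        centroids = new_centroids
        if shift < tol:
            break
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])
```

Sorting by area makes it straightforward to hand the two larger anchors to the coarse 13 × 13 scale and the two smaller ones to the 26 × 26 scale, although that particular assignment is itself an assumption.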
A feature extraction network and a detection network are designed. The lightweight feature extraction network is built under the PyTorch framework and comprises seven convolutional blocks, each consisting of a convolution operation and a maximum pooling operation; 3 × 3 and 1 × 1 convolution kernels filter the image, discarding irrelevant information in the aerial image and extracting useful information related to tower features. A loss function is also designed under the PyTorch framework; it includes the error between the predicted and actual positions of the rectangular frame, the error between the predicted class confidence and the actual class, and so on. The detection network uses feature maps at 13 × 13 and 26 × 26 resolution, producing outputs of size 13 × 13 × 14 and 26 × 26 × 14; the last dimension contains the confidence, the center coordinates of the target frame, the width and height of the target frame, and the tower type.
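To make the output sizes concrete, the sketch below counts the prediction frames produced by the two scales and decodes one raw prediction into a frame. The center formulas bx = σ(tx) + cx and by = σ(ty) + cy are those given in the claims; the exponential rule for width and height is the usual YOLO-style form and is assumed here, as are the function names.

```python
import math


def sigmoid(t):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t, cell, anchor):
    """Decode raw outputs (tx, ty, tw, th) for one grid cell and one anchor.

    Returns the center in grid units and the width/height in anchor units.
    """
    tx, ty, tw, th = t
    cx, cy = cell      # offset of the cell from the top-left grid cell
    pw, ph = anchor    # anchor width and height
    bx = sigmoid(tx) + cx
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)  # assumed YOLO-style width rule
    bh = ph * math.exp(th)  # assumed YOLO-style height rule
    return bx, by, bw, bh


def total_predictions(grid_sizes=(13, 26), boxes_per_cell=2):
    """Number of prediction frames over all scales."""
    return sum(g * g * boxes_per_cell for g in grid_sizes)
```

With two frames per cell, the 13 × 13 scale contributes 338 frames and the 26 × 26 scale 1352, which matches the 1690 frames passed to non-maximum suppression.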
The training set contains 12000 pictures of normal and abnormal towers. Training uses stochastic gradient descent with a batch size of 16, a momentum factor of 0.9, a weight decay coefficient of 0.005 and an initial learning rate of 0.001; it runs for 100 epochs, for a total of 75000 training batches, and the error tolerance is set to 10^-1. An online enhancement step for the training data is specially designed for network training; it mainly includes rotation, cropping, scaling, mapping, color enhancement and the like, which improves the generalization and practical performance of the lightweight network.
When the optimal weights are applied to the rapid identification model, 1690 prediction frames are obtained. Non-maximum suppression is applied to the frames that meet the class confidence threshold to obtain the prediction frame with the highest confidence, which gives the position and type of the power distribution tower and thus achieves rapid identification of power distribution tower reverse-off information.
The feature extraction network provided by this embodiment requires few training parameters, so the feedforward inference of the network, that is, the identification of power distribution tower reverse-off information in an aerial image, requires correspondingly little computation, and real-time response can be achieved. Meanwhile, the feature extraction network can very easily be cut down or extended to suit the performance of different devices. The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.
Claims (5)
1. A rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images, characterized by comprising the following steps:
step S1: manually labeling the power distribution tower information in the unmanned aerial vehicle aerial images using LabelImg software, dividing the power distribution towers into a normal type and a reverse-off type, and generating XML format files; each XML format file comprises the position information and category information of the towers; the XML format files are preprocessed with a Python script, that is, the script normalizes the tower position information by the length and width of the aerial image, the returned labels are the center-point coordinates, width and height (x, y, w, h) of each tower labeling frame and are stored in a txt file, with one txt file per image; at the same time, a training set and a cross training set are divided in a ratio of …:1;
step S2: for the power distribution tower images of the unmanned aerial vehicle aerial images in step S1, performing cluster analysis on the labeling frames using the K-Means algorithm to determine 4 anchor frames;
step S3: establishing a rapid identification model of power distribution tower reverse-off information and designing a loss function, training by error back propagation until the error converges to the tolerable range of 10^-1, and storing the optimal weights;
step S4: applying the optimal weights to the rapid identification model of power distribution tower reverse-off information for unmanned aerial vehicle aerial images; the model outputs a plurality of prediction frames, repeated target frames are removed by non-maximum suppression, and the position information of normal and reverse-off power distribution towers is finally obtained, completing the rapid identification of reverse-off information.
2. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, characterized in that step S2 specifically comprises the following steps:
step S21: using a script to read, from each txt file obtained in step S1, the center-point coordinates, width and height (x, y, w, h) of the power distribution tower labeling frames in each sample, setting the cluster number k to 4, and randomly initializing the cluster centers of the four classes; a single txt file contains the (x, y, w, h) information of a plurality of power distribution tower labeling frames;
step S22: calculating the distance from each sample point to each cluster center; a sample point is the labeling frame information (x, y, w, h) of a power distribution tower contained in a txt file, and a cluster center is the randomly initialized center-point coordinates, width and height of a labeling frame, of the same data type as the sample points;
the distance function between the labeling box (x, y, w, h) and the cluster center box is set as:
d(box,centroid)=1-IOU(box,centroid)
wherein (x, y, w, h) are the center coordinates, width and height of the labeling frame, IOU is the intersection-over-union between frames, and centroid denotes the center coordinates, width and height of the cluster center frame; the intersection-over-union is the overlap rate of the labeling frame box1 and the cluster center frame box2, expressed by the formula:
IOU(box1,box2)=area(box1∩box2)/area(box1∪box2)
step S23: dividing each sample into corresponding classes according to the distance;
step S24: calculating the sum of the sample points of each class, averaging the sum of the sample points, and updating the center of the class;
step S25: judging whether the difference between the current cluster centers and the previous cluster centers is smaller than a set limit; if not, returning to step S22; if so, the 4 anchor frames obtained by clustering are assigned to the 13 × 13 and 26 × 26 scales, two per scale.
3. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, characterized in that in step S3, establishing the rapid identification model of power distribution tower reverse-off information and designing the loss function specifically comprises:
defining the feature extraction network of the rapid identification model of power distribution tower reverse-off information, the feature extraction network comprising 7 convolutional blocks, each consisting of a convolution operation and a maximum pooling operation;
defining the resolution feature network, with the output format defined at two feature scales: resolutions of 13 × 13 and 26 × 26; the resolution feature network comprises 3 × 3 and 1 × 1 convolution operations; the final output of the rapid identification model consists of three-dimensional matrices of size 13 × 13 × 14 and 26 × 26 × 14, whose last dimension comprises, for two power distribution tower target frames after transformation, the center position bx, by, the width and height bw, bh, the category, and the confidence of the target frame, 7 × 2 values in total; the model therefore outputs 13 × 13 × 2 + 26 × 26 × 2 target frames, 1690 in total; finally, these are converted into two positioning frames of the object to be recognized, the target category in each frame and the confidence of each frame; the output corresponding to each grid is set to two prediction frames; the conversion formula is:
bx=σ(tx)+cx
by=σ(ty)+cy
bw=pw·e^tw
bh=ph·e^th
where cx, cy are the number of grids by which the grid containing the frame center is offset from the first grid at the upper left corner; tx, ty are the predicted coordinates of the frame center point; σ() is the logistic function, which normalizes the coordinates to between 0 and 1, so that the resulting bx, by are normalized values relative to the grid; tw, th are the predicted width and height of the frame; pw, ph are the width and height of the anchor frame. The conversion inverts the encoding, recovering from (tx, ty, tw, th) the center coordinates, width and height of the final target frame;
making the real labels: the real label has the form n × n × 14, where n is 13 or 26, corresponding to the network output; the input image is divided into 13 × 13 and 26 × 26 grids; for a given sample in the training set, if the center of the normalized tower coordinates obtained in step S1 falls into a certain grid, the class probability of the tower in the output vector of that grid is set to 1, and the prediction probabilities of the other grids for that tower are set to 0;
establishing the loss function: by back-propagating the loss function, the network automatically learns the features of normal towers and reverse-off towers; the loss function is as follows:
wherein λcoord is set to 20 and represents the weight of the position error, to strengthen the penalty on position loss; λnoobj is set to 1 and represents the weight of the confidence error of a frame when no object exists in the output frame, weakening the penalty on frame confidence when no object exists; the indicator term adds the position error when an object exists in the output frame; K represents the number of divided grids; M represents the number of output frames.
4. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, characterized in that removing repeated target frames by non-maximum suppression in step S4 comprises:
step SA: collecting all the obtained frames, 13 × 13 × 2 + 26 × 26 × 2 = 1690 in total, and sorting them according to the confidence score;
step SB: setting a confidence threshold of 0.6, and setting to 0 the score of every frame whose confidence is lower than the threshold;
step SC: traversing all the frames, finding the object and prediction frame with the maximum score, and adding them to an output list;
step SD: traversing the remaining frames and, according to a preset IOU threshold, setting to 0 the scores of all candidate frames whose intersection-over-union with the frame in the output list is higher than the threshold;
step SE: judging whether any frame with a non-zero confidence score remains; if so, returning to step SC; otherwise, outputting the prediction frames in the list.
5. The rapid identification method for power distribution tower reverse-off information based on unmanned aerial vehicle aerial images as claimed in claim 1, characterized in that in step S3, online random enhancement is applied to the image data input during training, including slight scaling of the image by a factor between 0.9 and 1.1, random cropping by 10%, and color enhancement, so as to improve the generalization ability of the network and compensate for the lightweight model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911154496.7A CN110929646A (en) | 2019-11-22 | 2019-11-22 | Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110929646A true CN110929646A (en) | 2020-03-27 |
Family
ID=69851624
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911154496.7A Pending CN110929646A (en) | 2019-11-22 | 2019-11-22 | Power distribution tower reverse-off information rapid identification method based on unmanned aerial vehicle aerial image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110929646A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109325418A (en) * | 2018-08-23 | 2019-02-12 | 华南理工大学 | Based on pedestrian recognition method under the road traffic environment for improving YOLOv3 |
CN109978035A (en) * | 2019-03-18 | 2019-07-05 | 西安电子科技大学 | Pedestrian detection method based on improved k-means and loss function |
CN110059554A (en) * | 2019-03-13 | 2019-07-26 | 重庆邮电大学 | A kind of multiple branch circuit object detection method based on traffic scene |
CN110245644A (en) * | 2019-06-22 | 2019-09-17 | 福州大学 | A kind of unmanned plane image transmission tower lodging knowledge method for distinguishing based on deep learning |
AU2019101142A4 (en) * | 2019-09-30 | 2019-10-31 | Dong, Qirui MR | A pedestrian detection method with lightweight backbone based on yolov3 network |
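Non-Patent Citations (1)
Title |
---|
Guo Jingdong et al.: "Real-time detection of unmanned aerial vehicle power line tower inspection images based on YOLO", 《中国电力》 (Electric Power) * |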
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111860245A (en) * | 2020-04-01 | 2020-10-30 | 国网福建省电力有限公司 | Inverted power distribution tower positioning method based on aerial tower image shot by unmanned aerial vehicle |
CN112380944A (en) * | 2020-11-06 | 2021-02-19 | 中国电力科学研究院有限公司 | Method and system for evaluating structural state of transmission tower |
CN112380944B (en) * | 2020-11-06 | 2021-12-21 | 中国电力科学研究院有限公司 | Method and system for evaluating structural state of transmission tower based on satellite remote sensing |
CN112508076A (en) * | 2020-12-02 | 2021-03-16 | 国网江西省电力有限公司建设分公司 | Intelligent identification method and system for abnormal state of power engineering |
CN112541455A (en) * | 2020-12-21 | 2021-03-23 | 国网河南省电力公司电力科学研究院 | Machine vision-based method for predicting accident of pole breakage of concrete pole of distribution network |
CN112541455B (en) * | 2020-12-21 | 2023-07-07 | 国网河南省电力公司电力科学研究院 | Machine vision-based prediction method for reverse breaking accidents of distribution network concrete electric pole |
CN113011405A (en) * | 2021-05-25 | 2021-06-22 | 南京柠瑛智能科技有限公司 | Method for solving multi-frame overlapping error of ground object target identification of unmanned aerial vehicle |
CN113011405B (en) * | 2021-05-25 | 2021-08-13 | 南京柠瑛智能科技有限公司 | Method for solving multi-frame overlapping error of ground object target identification of unmanned aerial vehicle |
CN114898221A (en) * | 2022-07-14 | 2022-08-12 | 灵图数据(杭州)有限公司 | Tower inclination detection method and device, electronic equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200327 |