CN111553204B - Transmission tower detection method based on remote sensing image - Google Patents

Transmission tower detection method based on remote sensing image

Info

Publication number
CN111553204B
CN111553204B (application CN202010279995.5A)
Authority
CN
China
Prior art keywords
drbox
detection
priori
layer
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010279995.5A
Other languages
Chinese (zh)
Other versions
CN111553204A (en)
Inventor
田桂申
宋猛
白雪娇
刘丽娟
邹睿翀
莫明飞
杨知
费香泽
李闯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Eastern Inner Mongolia Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Eastern Inner Mongolia Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, State Grid Eastern Inner Mongolia Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202010279995.5A priority Critical patent/CN111553204B/en
Publication of CN111553204A publication Critical patent/CN111553204A/en
Application granted granted Critical
Publication of CN111553204B publication Critical patent/CN111553204B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/176Urban or other man-made structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a transmission tower detection method based on remote sensing images. It uses multi-source high-resolution satellite remote sensing data to automatically detect and extract transmission tower targets, addresses the difficulties of applying computer vision object detection to satellite remote sensing imagery, overcomes problems such as the small size of the tower body and its highly variable appearance in an overhead view, and realizes automatic detection of transmission towers in high-resolution optical remote sensing images. Statistical analysis of the number of detected transmission towers can then guide routine monitoring of the construction progress of large-scale power grid projects.

Description

Transmission tower detection method based on remote sensing image
Technical Field
The invention relates to the field of power grid construction engineering, in particular to a transmission tower detection method based on optical satellite remote sensing images.
Background
Progress monitoring of power grid construction has traditionally relied mainly on manual inspection. For large-area grid-connection projects in highly variable environments with complex and hazardous terrain, this traditional approach suffers from a narrow monitoring range, high risk, strong environmental constraints and poor reliability; it falls short in coverage, timeliness, reliability and safety, which makes routine, efficient and fine-grained monitoring and auditing of such projects difficult.
Compared with traditional observation means, satellite observation has major advantages, and it is particularly well suited to power grids deployed over large areas, transmitting power over long distances and operating in complex, changing environments. Satellite remote sensing covers a wide area and can acquire large-scale data quickly; it acquires information rapidly, has a short update cycle and is dynamic and timely, which manual field measurement and aerial photogrammetry cannot match. Second, satellite remote sensing is less constrained by ground conditions and can acquire data in time over areas that are difficult for people to reach, such as deserts, steep mountains and high-altitude regions with extremely harsh natural conditions. In addition, satellite remote sensing yields a large amount of information: different bands such as visible light, ultraviolet, infrared and microwave, and different remote sensing instruments, can be chosen for different tasks, so multi-dimensional mass information can be obtained around the clock and in all weather. Making full use of multi-source high-resolution satellite remote sensing data therefore enables routine progress monitoring of large-area power grid projects.
Regarding automatic detection of towers in high-resolution optical remote sensing images, target detection is an important research direction in computer vision, but compared with general computer vision target detection, detection on remote sensing images faces additional difficulties: (1) the targets are small: limited by image resolution, the length and width of a target worth detecting are usually only tens of pixels; (2) prior information on target size is available: objects in natural images appear larger when near and smaller when far, so the same class of object varies widely in size across images, whereas a remote sensing image has a definite resolution, so the size of the object to be detected is known in advance; (3) the target angle is random: objects in natural images usually appear at relatively uniform angles, such as vertical or horizontal, but a remote sensing image is an overhead view of the target, so the target's orientation appears at random.
Compared with research on transmission line condition monitoring based on aerial means (helicopters or unmanned aerial vehicles), there has been relatively little research on monitoring the transmission line body with satellite remote sensing, because for a long time the spatial resolution of satellite imagery could not meet the requirements of body condition monitoring: transmission towers could only be distinguished as point targets, and the tower type could not be identified. In recent years, satellite remote sensing has developed rapidly as an important means of Earth observation. The spatial resolution of the WorldView series of optical satellites reaches 0.3 m, eight spectral bands from visible light to near infrared are available, and the revisit period is about 2 days. Meanwhile, the spatial resolution of synthetic aperture radar (SAR) satellites such as RADARSAT-2 reaches 1 m, providing ground-object information in the microwave band, which meets the requirement of monitoring large structures of the transmission line body, and related research is growing. In 2003, Liao et al. clearly resolved transmission towers and the line's orientation from SAR images of a river flood inundation area. In 2005, Zhu Junjie compared SAR and QuickBird images near an overpass on the North Fifth Ring Road in Beijing, and the towers beside the overpass could be clearly distinguished in both. The SAR data used in these two studies had a spatial resolution of 1.25 m, and the transmission towers appeared as obvious triangular bright spots in the SAR images and could be clearly identified. Building on this, in 2007 Yang et al. established an automatic transmission tower recognition model from high-resolution polarimetric SAR images, accurately extracting the transmission lines in farmland and improving the level of automation of tower recognition. However, research to date remains at the stage of simple identification of the line body, and fine-grained recognition of transmission towers lacks dedicated work. Besides SAR-based studies, research on detecting transmission lines from optical satellite remote sensing data has also progressed. In 2015, Chen et al. constructed a peak feature of the transmission conductor in the Cluster Radon (CR) frequency-domain space and, for the first time at home and abroad, extracted transmission conductors from QuickBird optical satellite images, which is of milestone significance. However, the CR frequency-domain peak feature in that study actually corresponds to the edges of the conductor and its shadow, and shadow information obtained from prior conditions is needed to reduce the false alarm rate. This means the CR frequency-domain peak feature alone cannot extract transmission lines robustly and with high accuracy; the features of the transmission line body must be further mined with an improved algorithm to achieve accurate extraction of the line body.
Disclosure of Invention
In view of the shortcomings of traditional means such as manual inspection for monitoring and auditing the progress of power grid construction, and to meet the demand for routine monitoring of large-scale power grid construction progress, the invention provides a transmission tower detection method based on remote sensing images, which comprises the following steps:
step 1, constructing a transmission tower sample space using remote sensing images as the data source to form a transmission tower detection data set;
step 2, labeling the transmission tower detection data set: the remote sensing images containing towers are annotated so as to improve the adaptability of tower detection to remote sensing data of different resolutions, different bands and different imaging angles;
step 3, splitting the transmission tower detection data set into a body data set and a shadow data set;
step 4, constructing, for the body data set and the shadow data set respectively, a multi-angle rotatable candidate box (DRBox) deep learning detection model and its target detection network; the multi-angle rotatable candidate box DRBox is a rotatable candidate box carrying angle information that can search for the target to be detected at different positions of the input image;
step 5, dividing the transmission tower detection task into two parts, tower body detection and shadow detection, and training a target detection network for each part based on its own multi-angle rotatable candidate box DRBox deep learning detection model;
step 6, cutting the remote sensing image into multiple image tiles of a preset size;
step 7, for the multiple image tiles, performing body detection and shadow detection of the transmission tower, respectively, with the target detection networks of the DRBox deep learning detection models for the tower body and the shadow;
step 8, restoring longitude and latitude information from the body detection results and the shadow detection results, and fusing them onto the original remote sensing image to obtain the final transmission tower detection result.
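As an illustration of how steps 6 to 8 fit together, the following minimal Python sketch tiles an image, runs body and shadow detectors on each tile and shifts the detections back to full-image pixel coordinates; the function name, the detector callable interface and the dictionary keys are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def detect_towers(image, detectors, tile_size=300, stride=None):
    """Tile a large image (step 6) and run each detector on every tile (step 7).

    detectors: callables mapping a tile (H, W, C) array to a list of dicts
    {"cx", "cy", "w", "h", "angle", "score"} in tile pixel coordinates.
    """
    stride = stride or tile_size
    h, w = image.shape[:2]
    detections = []
    for row in range(0, h, stride):
        for col in range(0, w, stride):
            tile = image[row:row + tile_size, col:col + tile_size]
            for detect in detectors:
                for box in detect(tile):
                    box = dict(box)
                    box["cx"] += col   # step 8: shift back to full-image pixels
                    box["cy"] += row
                    detections.append(box)
    return detections

# usage with a dummy body detector and a dummy shadow detector
dummy = lambda tile: [{"cx": 10.0, "cy": 12.0, "w": 20.0, "h": 8.0,
                       "angle": 30.0, "score": 0.9}]
print(len(detect_towers(np.zeros((900, 900, 3)), [dummy, dummy])))  # 18
```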
According to this optical satellite remote sensing image transmission tower detection method, high-resolution remote sensing images are used to extract transmission line tower targets over a large area in complex environments. First, to deal with the fact that, because of changes in illumination and viewing geometry, satellite remote sensing may capture the shape information of the tower body incompletely, a technical scheme is proposed that combines tower body and shadow information for comprehensive detection and type recognition of tower targets, so that the detection framework is applicable both to oblique viewing angles (where the structural features of the body are distinct) and to near-nadir viewing angles (where they are not). Second, within the tower target detection scheme, considering the large aspect ratio of tower targets and the strong influence of target orientation, the multi-angle rotatable candidate box DRBox deep learning detection model is proposed to cope with the viewing-angle differences that may exist in remote sensing acquisitions, effectively adapting to the randomness of transmission tower orientation in remote sensing images. Compared against manual labeling, the transmission tower detection recall reaches 80% and the precision reaches 88.9%, which greatly reduces the manpower required in engineering operations, improves the efficiency of large-area transmission line progress monitoring in complex environments, and provides a decision basis for auxiliary auditing during the project.
Drawings
FIG. 1 is a schematic diagram of a tower in WorldView-1 satellite imagery (1:2000);
FIG. 2 is a schematic diagram of a tower in WorldView-2 satellite imagery (1:2000);
FIG. 3 is a schematic illustration of tower bounding-box labels;
FIG. 4 is a schematic diagram of the DRBox network architecture;
FIG. 5 is a tower detection workflow diagram;
FIG. 6 shows the tower detection results for the Mengdong Phase I transmission line;
Fig. 7 is a flowchart of a transmission tower detection method according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings illustrate diagrams in accordance with exemplary embodiments. These exemplary embodiments (which are also referred to herein as "examples") are described in sufficient detail to enable those skilled in the art to practice the present subject matter. The embodiments may be combined, other embodiments may be utilized, or structural and logical changes may be made without departing from the scope of the claims. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope is defined by the appended claims and their equivalents.
As shown in fig. 7, the invention provides a transmission tower detection method based on an optical satellite remote sensing image, which comprises the following steps:
step 1, remote sensing images are used as the data source to construct a transmission tower sample space and form a transmission tower detection data set; in this example the remote sensing images are high-resolution multi-source satellite remote sensing images;
Unlike natural images, remote sensing imaging is affected by numerous factors, including satellite-specific parameters such as orbit height and satellite zenith angle, as well as weather conditions such as cloud, rain and fog. As a result, different satellites imaging the same scene produce data that differ in brightness, hue and other aspects, and even the same satellite produces different imaging quality on different passes. Therefore, to maximize the performance of the tower detection model, mainstream high-resolution satellite data from home and abroad should be used extensively to construct a sample space with a large data volume and strong representativeness, as shown in fig. 1 and fig. 2.
Step 2, the transmission tower detection data set is labeled, i.e. the multi-source remote sensing data containing transmission towers are annotated, so as to improve the adaptability of transmission tower detection to remote sensing data of different resolutions, different bands and different imaging angles.
To ensure the accuracy of the tower detection model as far as possible, the multi-source remote sensing data containing towers must be annotated. For the position of each tower, a bounding-box labeling method is used to mark the minimum enclosing rectangle of the tower, thereby obtaining the tower's specific position on the image. The tower labeling scheme is shown in fig. 3.
As can be seen from the figure, a tower appears on the remote sensing image mainly as white pixels; the main structure of the tower is visible in the high-resolution image, and the shadow of the tower and the shadow of the high-voltage line on the ground can also be seen. To improve the performance of the tower detection model, the position of each tower must be labeled with its minimum enclosing rectangle, and each labeling box must completely cover the tower so that no part is omitted. At the same time, the box must not be too large: an oversized box contains more background (such as farmland, channels, trees, roads and buildings), which increases the noise in the sample and degrades the recognition performance of the tower detection model. Once the noise increases, the data distribution of the tower samples changes markedly, which is unfavorable for training a detection model with high recall and a low false positive rate.
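For concreteness, one possible form of a single labeled sample is sketched below; the field names and values are assumptions made for illustration (the patent does not prescribe a file format), with the angle field anticipating the rotatable DRBox training data described later.

```python
# Illustrative annotation record for one tower on one image tile.
tower_annotation = {
    "image": "worldview2_tile_0412.tif",  # hypothetical file name
    "class": "transmission_tower",
    "cx": 215.5,    # box centre, pixels
    "cy": 143.0,
    "w": 42.0,      # box width, pixels: tight enough to exclude background
    "h": 38.0,      # box height, pixels: large enough to cover the whole tower
    "angle": 25.0,  # degrees; orientation of the minimum enclosing rectangle
}
print(tower_annotation["class"], tower_annotation["cx"], tower_annotation["cy"])
```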
Step 3, splitting the transmission tower detection data set into an ontology data set and a shadow data set;
Step 4, constructing respective multi-angle rotatable candidate frames DRBox deep learning detection models and target detection networks thereof according to the body data set and the shadow data set; the multi-angle rotatable candidate frame DRBox is a rotatable candidate frame with angle information, and can search the target to be detected at different positions of the input image;
In view of the characteristics of tower target detection in high-resolution remote sensing data, and considering the subsequent tower type recognition task, the invention proposes the multi-angle rotatable candidate box DRBox to solve the object detection problem on remote sensing images.
The rotatable detection box effectively adapts to the arbitrary orientation of targets in remote sensing images. Compared with a conventional candidate box, the rotatable detection box has the following advantages:
1) the size and aspect ratio of the box reflect the shape of the target object;
2) an RBox contains fewer background pixels, which helps the detector distinguish foreground from background;
3) RBoxes effectively avoid overlap between adjacent target candidate boxes, which is more favorable for detecting dense targets.
The multi-angle rotatable candidate box DRBox is a key element of the detection. The convolutional network structure ensures that the rotatable candidate box can search for the target to be detected at different positions of the input image, and at each position the rotatable candidate box generates multi-angle predictions by rotating through a series of angles; this is the biggest difference between the DRBox detection method of the invention and other BBox-based detection methods. To reduce the total number of candidate boxes, the aspect ratio used in detection is kept consistent with the target type. Through the multi-angle rotatable candidate box strategy, the training network converts the detection task into a series of subtasks, each focusing on detection within a narrower angle range, thereby reducing the influence of target rotation on detection.
To address the random variation of the target angle, the invention uses the rotatable box DRBox carrying angle information. Each DRBox contains 7 parameters: the length and width of the target, the abscissa and ordinate of the target center point, the target angle, and the probabilities that the target is judged to be foreground and background.
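A possible in-memory representation of the 7-parameter DRBox described above is sketched below; the class and field names are illustrative assumptions rather than the patent's own data structure.

```python
from dataclasses import dataclass

@dataclass
class DRBox:
    """Rotatable candidate box with the 7 parameters listed above."""
    cx: float          # abscissa of the target centre point
    cy: float          # ordinate of the target centre point
    w: float           # target width
    h: float           # target length
    angle: float       # target angle, degrees
    p_fg: float = 0.0  # probability the box is judged foreground (a target)
    p_bg: float = 1.0  # probability the box is judged background

box = DRBox(cx=150.0, cy=96.0, w=12.0, h=40.0, angle=73.0)
print(box)
```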
The constructed target detection network consists of a data input layer, convolution layers, a prior DRBox layer, a position prediction layer, a confidence prediction layer and a loss layer, as shown in fig. 4.
A) Data Input Layer (Data Input Layer)
The data input layer reads in the training data; the target box information comprises the target's category label, the coordinates of the target center point, the target angle, and the target's length and width. Unlike general detection algorithms, the training data must provide target boxes with angle information.
B) Convolution layer (Convolution Layers)
Because the VGGNet network is deep and has strong feature extraction capability, the convolution layers of the invention use this network as the pre-trained model.
C) Priori DRBox layers (Prior DRBox Layer)
The prior DRBox layer is connected after the convolutional network and generates a series of prior DRBoxes.
The size of the prior DRBox is preset as an input to the algorithm. When the size of the target to be detected is fixed, it is set directly to the target size; when the target size varies within a small range, the prior DRBox size is chosen as the mean of that range; when the targets to be detected differ greatly in size, several groups of DRBoxes with different sizes are used, and DRBoxes of different sizes are drawn from different convolution layers. For multi-class target detection, the prior DRBoxes are divided into groups according to target class and prior size, and each group of DRBoxes is bound to a selected convolution layer according to a preset strategy.
The positions of the prior DRBoxes are derived from the downsampling relationship between the feature map and the input image, and each prior DRBox is generated from one position on the feature map. For example, for an input size of 300×300, DRBoxes generated on an 8×8 feature map cover the input image with a step of 300/8 = 37.5 pixels. When the distance between targets is smaller than this step, targets will be missed, so a suitable feature layer must be chosen according to the target size to ensure that no targets are missed. Intuitively, smaller prior DRBoxes should be generated by shallower layers, while larger prior DRBoxes may be generated by deeper layers. Generating prior DRBoxes on different convolution layers gives the network the ability to detect targets of different sizes.
For the network to be able to detect target angles, angle information must be introduced into the prior DRBoxes, and the angles should cover the target angles that may occur. When targets need to be distinguished head from tail, the angle of the prior DRBox ranges from 0 to 360 degrees and is discretized with a preset step L1; when targets do not need head-tail distinction, the angle ranges from 0 to 180 degrees and is discretized with a preset step L2. In summary, given a specific target type and size, R prior DRBoxes with different angles are generated at each location of the selected convolution layer feature map, and prior DRBoxes for targets of different sizes are generated on different convolution layer feature maps.
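The prior-generation rule described above can be summarised by the following sketch, which places R = angle_range / angle_step prior DRBoxes at every cell of one feature map; the prior width, height and angle step are assumed example values, not parameters fixed by the patent.

```python
import numpy as np

def generate_prior_drboxes(input_size=300, feature_size=8,
                           prior_w=40.0, prior_h=12.0,
                           angle_range=180.0, angle_step=30.0):
    """Priors (cx, cy, w, h, angle) on one feature map of a given size."""
    stride = input_size / feature_size          # e.g. 300 / 8 = 37.5 pixels
    angles = np.arange(0.0, angle_range, angle_step)
    priors = []
    for i in range(feature_size):
        for j in range(feature_size):
            cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
            for a in angles:                    # R priors per location
                priors.append((cx, cy, prior_w, prior_h, a))
    return np.array(priors)

print(generate_prior_drboxes().shape)  # (8 * 8 * 6, 5) = (384, 5)
```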
D) Position predicting layer (Location Prediction Layer)
The position prediction layer generates position correction information for each prior DRBox to obtain the position information of the predicted DRBox; it is obtained from a 3×3 sliding window on the feature map from which the DRBox is drawn.
E) Confidence prediction layer (Confidence Prediction Layer)
The confidence prediction layer generates confidence information for each prior DRBox, giving the confidence that the predicted DRBox belongs to each class. Because each prior DRBox is bound to a target class in the invention, the confidence information is a 2-dimensional vector representing the probabilities that the DRBox belongs to the target and to the background, respectively, and it is obtained from a 3×3 sliding window on the feature map from which the DRBox is drawn.
F) Loss layer (MultiDRBox Loss Layer)
The loss layer computes the loss function and generates the error back-propagation quantities of the network. Its input comprises four parts: the label part of the data input layer and the outputs of the position prediction layer, the confidence prediction layer and the prior DRBox layer. The loss function of the loss layer is a weighted superposition of two parts, the position loss and the confidence loss, which measure the accuracy of the position prediction and the confidence prediction respectively, and can be written in the following form:
L(x, c, l, g) = (1/N) · (L_conf(x, c) + α · L_loc(x, l, g))
where L_conf(x, c) is the confidence loss, x is the indicator variable describing the matching between real target boxes and predicted DRBoxes, and c is the confidence prediction vector; L_loc(x, l, g) is the position loss, l is the position prediction vector, i.e. the offset between the predicted DRBox parameters and the prior DRBox parameters, and g is the offset between the real target DRBox parameters and the prior DRBox parameters; α is the position loss weighting factor, and when α = 0 the position loss L_loc(x, l, g) is not used; N is the number of matched DRBoxes. When N = 0, the loss layer outputs 0.
On the basis of the single-stage detection network DRBox, the task is divided into two parts, tower body detection and shadow detection, and the two models are trained separately, so that the detection framework is applicable both to oblique viewing angles (where the structural features of the body are distinct) and to near-nadir viewing angles (where they are not).
The working flow of the tower detection is shown in fig. 5, and specifically comprises the following steps:
Step 5, dividing the transmission tower detection task into two parts, tower body detection and shadow detection, and training a target detection network for each part based on its own multi-angle rotatable candidate box DRBox deep learning detection model. The specific steps comprise:
a) DRBox encoding;
b) DRBox matching: during training, the prior DRBoxes are compared one by one with the real DRBoxes to determine positive and negative samples;
c) positive and negative sample equalization;
d) loss calculation: the difference between the predicted DRBox and the real DRBox, comprising a position loss and a confidence loss;
e) error back-propagation.
For the adopted network model, the network training thus comprises DRBox encoding and decoding, DRBox matching, positive and negative sample equalization, loss calculation and error back-propagation.
a) The DRBox encoding process is as follows:
During forward propagation, obtaining the offset between the i-th prior DRBox and the j-th real DRBox from their position information is the encoding process, with the following notation:
d_i^m denotes the position information of the i-th prior DRBox and g_j^m the position information of the j-th real target DRBox; ĝ_ij^m denotes the offset information of the i-th prior DRBox with respect to the j-th real target DRBox, and ĝ^m denotes the vector formed by the offset information of all real targets, with m ∈ {cx, cy, w, h, a};
d_i^cx, d_i^cy, d_i^w, d_i^h and d_i^a denote the abscissa, ordinate, width, height and angle of the i-th prior DRBox; g_j^cx, g_j^cy, g_j^w, g_j^h and g_j^a denote the abscissa, ordinate, width, height and angle of the j-th real target DRBox; ĝ_ij^cx, ĝ_ij^cy, ĝ_ij^w, ĝ_ij^h and ĝ_ij^a denote the corresponding offsets of the i-th prior DRBox from the j-th real target DRBox.
B) DRBox match
During training, the prior DRBoxes must be matched one by one with the real DRBoxes to determine positive and negative samples. The closeness between two boxes is described by IoU (Intersection over Union), defined as the ratio of the intersection area to the union area of the two boxes. To give the angle a higher weight in the calculation, the ratio is multiplied by the absolute value of the cosine of the difference between the two DRBox angles, and this newly defined metric is denoted RIoU. If two DRBoxes have no intersection, RIoU is 0. Since each prior DRBox in the algorithm carries class information, RIoU is also 0 if the prior DRBox and the real DRBox do not belong to the same class.
The algorithm matches prior DRBoxes with real DRBoxes according to the following strategy:
for each real DRBox, if there exists a prior DRBox with RIoU > 0, the prior DRBox with the largest RIoU is matched to that real DRBox and the indicator variable is set to 1, otherwise it is 0;
for each prior DRBox, if there exists a real DRBox whose RIoU with it is greater than 0.5, the two DRBoxes are matched and the indicator variable is set to 1, otherwise it is 0;
the set of sequence numbers of all prior DRBoxes matched with a real DRBox is denoted Pos;
the prior DRBoxes in the set Pos participate in the loss function calculation as positive samples, and the other prior DRBoxes serve as candidate negative samples;
the indicator variable is x = {x_ij}, x_ij ∈ {1, 0}, where x_ij indicates whether the i-th prior DRBox is matched with the j-th real DRBox, 1 meaning matched and 0 not matched;
the RIoU is the IoU multiplied by the absolute value of the cosine of the angle difference between the two DRBoxes; the IoU is the ratio of the intersection area to the union area of the two boxes, i.e. IoU describes the closeness between two boxes.
The prior DRBoxes matched in this way participate in the loss function calculation as positive samples, while the other prior DRBoxes serve as candidate negative samples.
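The RIoU measure used in this matching step can be computed as sketched below, assuming the shapely library is available for the polygon intersection; any other rotated-rectangle intersection routine would serve equally well.

```python
import numpy as np
from shapely.geometry import Polygon  # assumed available for polygon clipping

def drbox_polygon(box):
    """Corner polygon of a rotated box (cx, cy, w, h, angle in degrees)."""
    cx, cy, w, h, a = box
    rad = np.deg2rad(a)
    rot = np.array([[np.cos(rad), -np.sin(rad)], [np.sin(rad), np.cos(rad)]])
    corners = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return Polygon(corners @ rot.T + np.array([cx, cy]))

def riou(box_a, box_b):
    """RIoU: plain IoU weighted by |cos| of the angle difference."""
    pa, pb = drbox_polygon(box_a), drbox_polygon(box_b)
    inter = pa.intersection(pb).area
    if inter == 0.0:
        return 0.0
    iou = inter / (pa.area + pb.area - inter)
    return iou * abs(np.cos(np.deg2rad(box_a[4] - box_b[4])))

print(riou((50, 50, 40, 12, 0), (52, 50, 40, 12, 20)))
```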
C) Positive and negative sample equalization
Typically, the number of negative samples obtained by the matching strategy is far larger than the number of positive samples, so the negative samples must be reduced to balance the two. During model training we care more about negative samples that are easily confused with positive samples. Detection algorithms based on convolutional networks therefore usually adopt a hard negative mining method to reduce the negative samples: first, the confidence of each negative sample being background is computed by the confidence prediction layer; second, the background confidences of all negative samples are sorted; finally, negative samples are sampled in order from low confidence to high confidence, so that the numbers of positive and negative samples satisfy a given ratio, yielding the set Neg of prior DRBox sequence numbers of the balanced negative samples.
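A minimal sketch of this hard negative mining step follows; the 3:1 negative-to-positive ratio is an assumed value commonly used by single-stage detectors, not a figure stated in the patent.

```python
import numpy as np

def hard_negative_mining(bg_confidence, pos_indices, neg_pos_ratio=3):
    """Select the hardest negatives (lowest background confidence) to form Neg."""
    bg_confidence = np.asarray(bg_confidence, dtype=float)
    candidates = np.setdiff1d(np.arange(len(bg_confidence)), pos_indices)
    # sort candidate negatives from low to high background confidence,
    # i.e. the ones most easily confused with foreground come first
    order = candidates[np.argsort(bg_confidence[candidates])]
    n_keep = min(len(order), neg_pos_ratio * max(len(pos_indices), 1))
    return order[:n_keep]

print(hard_negative_mining([0.9, 0.2, 0.95, 0.4, 0.1], pos_indices=[4]))  # [1 3 0]
```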
D) Loss calculation
The cost loss function comprises two parts, the position loss and the confidence loss; the cost function computes the difference between the predicted DRBox and the real DRBox. Each prior DRBox corresponds to one predicted DRBox, whose parameters are output by the network.
First, the position loss L_loc(x, l, g) is calculated by
L_loc(x, l, g) = Σ_{i∈Pos} Σ_{m∈{cx,cy,w,h,a}} x_ij · smooth_L1(l_i^m − ĝ_ij^m)
where smooth_L1 is the smooth L1 norm function, smooth_L1(z) = 0.5·z² for |z| < 1 and |z| − 0.5 otherwise.
Second, the confidence vector ĉ_i of each DRBox is a two-element array representing the probabilities that it belongs to the foreground (1) and the background (0), respectively. The confidence loss L_conf(x, c) is calculated with the Softmax function:
L_conf(x, c) = − Σ_{i∈Pos} x_ij · log(ĉ_i^1) − Σ_{i∈Neg} log(ĉ_i^0), with ĉ_i^p = exp(c_i^p) / Σ_{p'} exp(c_i^{p'}).
Finally, the cost loss is calculated by the loss function of the loss layer:
L(x, c, l, g) = (1/N) · (L_conf(x, c) + α · L_loc(x, l, g))
where N is the number of matched DRBoxes; when N = 0, the loss layer outputs 0. α is the position loss weighting factor, α ∈ [0, 1]; x_ij is the indicator variable for the matching of the i-th prior DRBox with the j-th real DRBox; the output of the confidence prediction layer of the detection network is c_i, the confidence prediction information of the i-th prior DRBox, which after the Softmax gives the two-element array of foreground (1) and background (0) probabilities; the output of the position prediction layer is l_i^m, the encoded position prediction information of the i-th prior DRBox parameters; Pos is the set of sequence numbers of the prior DRBoxes matched with real DRBoxes; GT is the set of sequence numbers of all real DRBoxes; Neg is the set of negative-sample prior DRBox sequence numbers after positive and negative sample equalization; smooth_L1 is the smooth L1 norm function.
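Under the loss form given above, the combined loss can be computed as in the following NumPy sketch; the array shapes and the exact softmax cross-entropy form are assumptions consistent with the surrounding text rather than a verbatim transcription of the patented formula.

```python
import numpy as np

def smooth_l1(z):
    z = np.abs(z)
    return np.where(z < 1.0, 0.5 * z ** 2, z - 0.5)

def multi_drbox_loss(loc_pred, loc_target, conf_logits, pos, neg, alpha=1.0):
    """loc_pred/loc_target: (P, 5) offsets for the matched (positive) priors;
    conf_logits: (M, 2) scores for [background, foreground] of every prior;
    pos/neg: index arrays of positive priors and retained negative priors."""
    n = len(pos)
    if n == 0:
        return 0.0
    l_loc = smooth_l1(loc_pred - loc_target).sum()             # position loss
    exp = np.exp(conf_logits - conf_logits.max(axis=1, keepdims=True))
    prob = exp / exp.sum(axis=1, keepdims=True)                # softmax
    l_conf = -np.log(prob[pos, 1]).sum() - np.log(prob[neg, 0]).sum()
    return (l_conf + alpha * l_loc) / n

loc_p, loc_t = np.zeros((2, 5)), np.full((2, 5), 0.1)
conf = np.array([[0.2, 2.0], [0.1, 1.5], [3.0, 0.0], [2.5, 0.2]])
print(multi_drbox_loss(loc_p, loc_t, conf, pos=[0, 1], neg=[2, 3]))
```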
E) Error counter-transmission
In the detection network, the error from the loss layer is propagated back to the position prediction layer, the confidence prediction layer and every convolution layer, and the parameters of each layer are corrected, i.e. the parameters of the network model are updated using the gradients of the loss function. The prior DRBox layer does not need to accept back-propagated errors. Therefore, only the gradients of the loss function with respect to the outputs of the position prediction layer and the confidence prediction layer need to be computed at the loss layer.
Step 6, the remote sensing image is cut into image tiles of a preset size.
Step 7, for the multiple image tiles, body detection and shadow detection of the transmission tower are performed, respectively, by the target detection networks of the DRBox deep learning detection models for the tower body and the shadow.
After training, the detection network obtains detection results through the following steps:
a) the cropped image tile is input into the trained deep learning detection model, and the target detection network in the model outputs an encoded position prediction vector and a confidence prediction vector;
b) the position information of the predicted DRBoxes is obtained from the encoded position prediction vector and the prior DRBoxes through a decoding process, and the confidence prediction result and the class information of the prior DRBox are associated with each predicted DRBox;
c) the final detection result is obtained by non-maximum suppression (NMS).
The decoding process in step b) is the inverse of the encoding used in training: the position information l_i^m of the i-th predicted DRBox, m ∈ {cx, cy, w, h, a}, is recovered from the encoded position prediction information output by the network for the i-th prior DRBox and the position information d_i^m of that prior DRBox; l_i^cx, l_i^cy, l_i^w, l_i^h and l_i^a denote the abscissa, ordinate, width, height and angle of the i-th predicted DRBox, respectively.
For each class of target to be detected, the NMS first sorts the output results whose foreground confidence exceeds a given value by confidence and takes a given number of output DRBoxes in turn. These DRBoxes are added to the output queue one by one, ensuring each time that the RIoU between a newly output DRBox and any DRBox already output does not exceed a given threshold. The NMS ensures that no target is selected by multiple DRBoxes at the same time and that the DRBox finally output by the algorithm is the predicted DRBox with the highest confidence for that target.
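The NMS step can be sketched as below, reusing the riou() helper from the earlier matching sketch; the score and overlap thresholds are assumed example values rather than the patent's settings.

```python
import numpy as np

def rotated_nms(boxes, scores, riou_threshold=0.3, score_threshold=0.5,
                max_outputs=100):
    """Greedy non-maximum suppression over rotated boxes using RIoU overlap."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] >= score_threshold]
    keep = []
    for i in order:                       # highest-confidence boxes first
        if len(keep) >= max_outputs:
            break
        # keep box i only if it does not overlap an already selected box too much
        if all(riou(boxes[i], boxes[j]) <= riou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(50, 50, 40, 12, 0), (52, 50, 40, 12, 5), (200, 200, 40, 12, 90)]
print(rotated_nms(boxes, np.array([0.9, 0.8, 0.7])))  # e.g. [0, 2]
```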
Step 8, longitude and latitude information is restored from the body detection results and the shadow detection results, and the two are fused onto the original remote sensing image to obtain the transmission tower detection result.
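The restoration of longitude and latitude in step 8 amounts to mapping a detection's pixel position in the original image through the image's georeferencing; the sketch below uses the standard GDAL-style 6-element geotransform convention as an assumed representation of that georeferencing.

```python
def pixel_to_lonlat(col, row, geotransform):
    """Map full-image pixel coordinates to geographic coordinates."""
    x0, px_w, rot_x, y0, rot_y, px_h = geotransform
    lon = x0 + col * px_w + row * rot_x
    lat = y0 + col * rot_y + row * px_h
    return lon, lat

# usage: north-up image whose upper-left corner is at (116.0 E, 45.0 N)
gt = (116.0, 0.5e-5, 0.0, 45.0, 0.0, -0.5e-5)
print(pixel_to_lonlat(1200, 800, gt))  # (116.006, 44.996)
```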
Tests were carried out on the Mengdong Phase I, Phase II and Phase III work areas. The work areas cover six transmission lines under construction, including the line from the Beijing Energy (Jingneng) Wujianfang power plant to the China Resources (Huarun) Wujianfang power plant and a line from a power plant to the Shengli substation. Fig. 6 shows the tower detection results for Mengdong Phases I, II and III.
According to this optical satellite remote sensing image transmission tower detection method, high-resolution remote sensing images are used to extract transmission line tower targets over a large area in complex environments. First, to deal with the fact that, because of changes in illumination and viewing geometry, satellite remote sensing may capture the shape information of the tower body incompletely, a technical scheme is proposed that combines tower body and shadow information for comprehensive detection and type recognition of tower targets, so that the detection framework is applicable both to oblique viewing angles (where the structural features of the body are distinct) and to near-nadir viewing angles (where they are not). Second, within the tower target detection scheme, considering the large aspect ratio of tower targets and the strong influence of target orientation, the multi-angle rotatable candidate box DRBox deep learning detection model is proposed to cope with the viewing-angle differences that may exist in remote sensing acquisitions, effectively adapting to the randomness of transmission tower orientation in remote sensing images. Compared against manual labeling, the transmission tower detection recall reaches 80% and the precision reaches 88.9%, which greatly reduces the manpower required in engineering operations, improves the efficiency of large-area transmission line progress monitoring in complex environments, and provides a decision basis for auxiliary auditing during the project.
While the present invention has been described with reference to specific exemplary embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the inventive subject matter. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (14)

1. A transmission tower detection method based on remote sensing images, characterized by comprising the following steps:
step 1, constructing a transmission tower sample space using remote sensing images as the data source to form a transmission tower detection data set;
step 2, labeling the transmission tower detection data set: the remote sensing images containing towers are annotated so as to improve the adaptability of tower detection to remote sensing data of different resolutions, different bands and different imaging angles;
step 3, splitting the transmission tower detection data set into a body data set and a shadow data set;
step 4, constructing, for the body data set and the shadow data set respectively, a multi-angle rotatable candidate box (DRBox) deep learning detection model and its target detection network; the multi-angle rotatable candidate box DRBox is a rotatable candidate box carrying angle information that can search for the target to be detected at different positions of the input image;
step 5, dividing the transmission tower detection task into two parts, tower body detection and shadow detection, and training a target detection network for each part based on its own multi-angle rotatable candidate box DRBox deep learning detection model;
step 6, cutting the remote sensing image into multiple image tiles of a preset size;
step 7, for the multiple image tiles, performing body detection and shadow detection of the transmission tower, respectively, with the target detection networks of the DRBox deep learning detection models for the tower body and the shadow;
step 8, restoring longitude and latitude information from the body detection results and the shadow detection results, and fusing them onto the original remote sensing image to obtain the final transmission tower detection result.
2. The method of claim 1, wherein the remote sensing image is a high resolution multi-source satellite remote sensing image.
3. The method of claim 1, wherein labeling the remote sensing images containing towers in step 2 specifically comprises: using a bounding-box labeling method to mark the minimum enclosing rectangle of each tower, thereby obtaining the specific position of the tower on the image.
4. The method of claim 1, wherein in step 4 the target detection network comprises a data input layer, convolution layers, a prior DRBox layer, a position prediction layer, a confidence prediction layer and a loss layer;
the data input layer is used for reading in training data, and a target box of the training data comprises the parameters: the length and width of the target, the abscissa and ordinate of the target center point, and the target angle;
the convolution layers use the VGGNet network as the pre-trained model;
the prior DRBox layer is connected after the convolutional network and generates a series of prior DRBoxes;
the position prediction layer is configured to generate position correction information for each prior DRBox, so as to obtain the position information of the predicted DRBox;
the confidence prediction layer is used for generating confidence information for each prior DRBox to obtain the confidence that the predicted DRBox belongs to each class;
the loss layer is used for calculating the loss function and generating the error back-propagation quantities of the network; its input comprises four parts, namely the label part of the data input layer and the outputs of the position prediction layer, the confidence prediction layer and the prior DRBox layer.
5. The method of claim 4, wherein the prior DRBox layer is connected after the convolutional network to generate a series of prior DRBoxes,
wherein the size of the prior DRBox is preset as an input to the algorithm; when the size of the target to be detected is fixed, it is set directly to the target size; when the target size varies within a small range, the prior DRBox size is chosen as the mean of that range; when the targets to be detected differ greatly in size, several groups of DRBoxes with different sizes are used, and DRBoxes of different sizes are drawn from different convolution layers;
for multi-class target detection, the prior DRBoxes are divided into groups according to target class and prior size, and each group of DRBoxes is bound to a selected convolution layer according to a preset strategy;
the positions of the prior DRBoxes are derived from the downsampling relationship between the selected convolution layer feature map and the input image, and each prior DRBox is generated from one position on the feature map; smaller prior DRBoxes are generated by shallower layers, larger prior DRBoxes are generated by deeper layers, and generating prior DRBoxes on different convolution layers gives the network the ability to detect targets of different sizes;
the angle information in the prior DRBox covers the possible target angles: when targets need to be distinguished head from tail, the angle of the prior DRBox ranges from 0 to 360 degrees and is discretized with a preset step L1; when targets do not need head-tail distinction, the angle ranges from 0 to 180 degrees and is discretized with a preset step L2.
6. The method of claim 4, wherein in the confidence prediction layer the confidence information is a 2-dimensional vector representing the probabilities that the DRBox belongs to the target and to the background, respectively.
7. The method of claim 1, wherein step 5, dividing the transmission tower detection task into tower body detection and shadow detection and training a target detection network for each part based on its own multi-angle rotatable candidate box DRBox deep learning detection model, specifically comprises:
step 5-1, DRBox encoding;
step 5-2, DRBox matching: during training, the prior DRBoxes are compared one by one with the real DRBoxes to determine positive and negative samples;
step 5-3, positive and negative sample equalization;
step 5-4, loss calculation: the difference between the predicted DRBox and the real DRBox, comprising a position loss and a confidence loss;
step 5-5, error back-propagation.
8. The method of claim 7, wherein the DRBox encoding of step 5-1 specifically comprises:
during forward propagation, obtaining the offset between the i-th prior DRBox and the j-th real DRBox from their position information is the encoding process, with the following notation:
d_i^m denotes the position information of the i-th prior DRBox, g_j^m denotes the position information of the j-th real target DRBox, and ĝ_ij^m denotes the offset information of the i-th prior DRBox with respect to the j-th real target DRBox, where m ∈ {cx, cy, w, h, a} and cx, cy, w, h, a denote the abscissa, ordinate, width, height and angle of a DRBox, respectively; ĝ^m denotes the vector formed by the offset information of all real targets.
9. The method of claim 8, wherein the DRBox matching of step 5-2 specifically comprises:
step 5-2-1, if two DRBoxes have no intersection, the RIoU is 0; if a prior DRBox and a real DRBox do not belong to the same class, the RIoU is also 0;
step 5-2-2, performing bidirectional matching of the prior DRBoxes and the real DRBoxes, the specific steps comprising:
for each real DRBox, if there exists a prior DRBox with RIoU > 0, the prior DRBox with the largest RIoU is matched to that real DRBox and the indicator variable is set to 1, otherwise it is 0;
for each prior DRBox, if there exists a real DRBox with RIoU > 0.5, the two DRBoxes are matched and the indicator variable is set to 1, otherwise it is 0;
the set of sequence numbers of all prior DRBoxes matched with a real DRBox is denoted Pos;
the prior DRBoxes in the set Pos participate in the loss function calculation as positive samples, and the other prior DRBoxes serve as candidate negative samples;
the indicator variable is x = {x_ij}, x_ij ∈ {1, 0}, where x_ij indicates whether the i-th prior DRBox is matched with the j-th real DRBox, 1 meaning matched and 0 not matched;
the RIoU is the IoU multiplied by the absolute value of the cosine of the angle difference between the two DRBoxes, and the IoU is the ratio of the intersection area to the union area of the two boxes.
10. The method of claim 8, wherein the positive and negative sample equalization of step 5-3 specifically comprises:
step 5-3-1, calculating, through the confidence prediction layer, the confidence of each negative sample being background;
step 5-3-2, sorting the background confidences of all negative samples;
step 5-3-3, sampling negative samples in order from low confidence to high confidence, so that the numbers of positive and negative samples satisfy a given ratio, to obtain the set Neg of prior DRBox sequence numbers of the balanced negative samples.
11. The method of claim 8, wherein the loss calculation of step 5-4 specifically comprises:
calculating the cost loss by the loss function of the loss layer,
L(x, c, l, g) = (1/N) · (L_conf(x, c) + α · L_loc(x, l, g)),
wherein the position loss is calculated by
L_loc(x, l, g) = Σ_{i∈Pos} Σ_{m∈{cx,cy,w,h,a}} x_ij · smooth_L1(l_i^m − ĝ_ij^m),
and the confidence loss is calculated by the Softmax function
L_conf(x, c) = − Σ_{i∈Pos} x_ij · log(ĉ_i^1) − Σ_{i∈Neg} log(ĉ_i^0), with ĉ_i^p = exp(c_i^p) / Σ_{p'} exp(c_i^{p'});
where N is the number of matched DRBoxes, and when N = 0 the loss layer outputs 0; α is the position loss weighting factor, α ∈ [0, 1]; x_ij is the indicator variable for the matching of the i-th prior DRBox with the j-th real DRBox; the output of the confidence prediction layer of the detection network is c_i, the confidence prediction information of the i-th prior DRBox, which after the Softmax gives the two-element array of foreground (1) and background (0) probabilities; the output of the position prediction layer is l_i^m, the encoded position prediction information of the i-th prior DRBox parameters; Pos is the set of sequence numbers of the prior DRBoxes matched with real DRBoxes; GT is the set of sequence numbers of all real DRBoxes; Neg is the set of negative-sample prior DRBox sequence numbers after positive and negative sample equalization; smooth_L1 is the smooth L1 norm function.
12. The method of claim 11, wherein the error back-propagation of step 5-5 specifically comprises:
in the detection network, the error from the loss layer is propagated back to the position prediction layer, the confidence prediction layer and each convolution layer, and the parameters of each layer are corrected, i.e. the parameters of the network model are updated using the gradients of the loss function; specifically, only the gradients of the loss function with respect to the outputs of the position prediction layer and the confidence prediction layer need to be calculated at the loss layer.
13. The method of claim 1, wherein step 7, performing body detection and shadow detection of the transmission tower on the multiple image tiles, respectively, with the target detection networks of the DRBox deep learning detection models for the tower body and the shadow, specifically comprises:
step 7-1, inputting the cropped image tile into the trained deep learning detection model, and outputting an encoded position prediction vector and a confidence prediction vector through the target detection network in the deep learning detection model;
step 7-2, obtaining the position information of the predicted DRBoxes from the encoded position prediction vector and the prior DRBoxes through a decoding process, and associating the confidence prediction result and the class information of the prior DRBox with each predicted DRBox;
step 7-3, obtaining the final detection result through non-maximum suppression (NMS).
14. The method of claim 13, wherein in step 7-2 obtaining the position information of the predicted DRBoxes from the position prediction vector and the prior DRBoxes through a decoding process comprises:
recovering the position information l_i^m of the i-th predicted DRBox, m ∈ {cx, cy, w, h, a}, from the encoded position prediction information of the i-th prior DRBox and the position information d_i^m of the i-th prior DRBox, wherein l_i^cx, l_i^cy, l_i^w, l_i^h and l_i^a denote the abscissa, ordinate, width, height and angle of the i-th predicted DRBox, respectively.
CN202010279995.5A 2020-04-10 2020-04-10 Transmission tower detection method based on remote sensing image Active CN111553204B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010279995.5A CN111553204B (en) 2020-04-10 2020-04-10 Transmission tower detection method based on remote sensing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010279995.5A CN111553204B (en) 2020-04-10 2020-04-10 Transmission tower detection method based on remote sensing image

Publications (2)

Publication Number Publication Date
CN111553204A CN111553204A (en) 2020-08-18
CN111553204B true CN111553204B (en) 2024-05-28

Family

ID=72005678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010279995.5A Active CN111553204B (en) 2020-04-10 2020-04-10 Transmission tower detection method based on remote sensing image

Country Status (1)

Country Link
CN (1) CN111553204B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112883840B (en) * 2021-02-02 2023-07-07 中国人民公安大学 Power transmission line extraction method based on key point detection
CN113033446B (en) * 2021-04-01 2024-02-02 辽宁工程技术大学 Transmission tower identification and positioning method based on high-resolution remote sensing image
CN113537142A (en) * 2021-08-03 2021-10-22 广东电网有限责任公司 Monitoring method, device and system for construction progress of capital construction project and storage medium
CN113804161A (en) * 2021-08-23 2021-12-17 国网辽宁省电力有限公司大连供电公司 Method for detecting inclination state of transmission tower

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN110751077A (en) * 2019-10-15 2020-02-04 武汉大学 Optical remote sensing picture ship detection method based on component matching and distance constraint
CN110796037A (en) * 2019-10-15 2020-02-14 武汉大学 Satellite-borne optical remote sensing image ship target detection method based on lightweight receptive field pyramid
CN110910440A (en) * 2019-09-30 2020-03-24 中国电力科学研究院有限公司 Power transmission line length determination method and system based on power image data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145931B (en) * 2018-09-03 2019-11-05 百度在线网络技术(北京)有限公司 Object detecting method, device and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019192397A1 (en) * 2018-04-04 2019-10-10 华中科技大学 End-to-end recognition method for scene text in any shape
CN110910440A (en) * 2019-09-30 2020-03-24 中国电力科学研究院有限公司 Power transmission line length determination method and system based on power image data
CN110751077A (en) * 2019-10-15 2020-02-04 武汉大学 Optical remote sensing picture ship detection method based on component matching and distance constraint
CN110796037A (en) * 2019-10-15 2020-02-14 武汉大学 Satellite-borne optical remote sensing image ship target detection method based on lightweight receptive field pyramid

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A monitoring and early-warning method for transmission towers under rainfall-induced landslide disasters; Chen Qiang; Wang Jian; Xiong Xiaofu; Feng Changyou; Ma Chao; Power System Protection and Control; 2020-02-01 (No. 03); full text *
Ship target detection in high-resolution remote sensing images based on a feature pyramid model; Zhou Hui; Yan Fenglong; Chu Na; Chen Peng; Journal of Dalian Maritime University; 2019-11-15 (No. 04); full text *

Also Published As

Publication number Publication date
CN111553204A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
CN111553204B (en) Transmission tower detection method based on remote sensing image
CN109919875B (en) High-time-frequency remote sensing image feature-assisted residential area extraction and classification method
Sohn et al. Automatic powerline scene classification and reconstruction using airborne lidar data
CN112149547B (en) Remote sensing image water body identification method based on image pyramid guidance and pixel pair matching
CN114419825B (en) High-speed rail perimeter intrusion monitoring device and method based on millimeter wave radar and camera
CN111310681B (en) Mangrove forest distribution remote sensing extraction method integrated with geoscience knowledge
Zhang et al. Self-attention guidance and multi-scale feature fusion based uav image object detection
CN113920436A (en) Remote sensing image marine vessel recognition system and method based on improved YOLOv4 algorithm
CN113838064B (en) Cloud removal method based on branch GAN using multi-temporal remote sensing data
CN115641514A (en) Pseudo visible light cloud map generation method for night sea fog monitoring
Zhang et al. Nearshore vessel detection based on Scene-mask R-CNN in remote sensing image
CN109829426A (en) Railway construction temporary building monitoring method and system based on high score remote sensing image
CN115908894A (en) Optical remote sensing image ocean raft type culture area classification method based on panoramic segmentation
Guo et al. Correction of sea surface wind speed based on SAR rainfall grade classification using convolutional neural network
CN115018285A (en) Storm surge and sea wave fine early warning system and early warning method
CN116229287B (en) Remote sensing sub-pixel epidemic wood detection method based on complex woodland environment
CN116503750A (en) Large-range remote sensing image rural block type residential area extraction method and system integrating target detection and visual attention mechanisms
CN116152666A (en) Cross-domain remote sensing image self-adaptive learning method considering ground feature weather heterogeneity
Mandroux et al. Wind turbine detection on sentinel-2 images
Li et al. Recognition algorithm for deep convective clouds based on FY4A
Huang et al. Shadow Information-Based Slender Targets Detection Method in Optical Satellite Images
Liu et al. Intelligent identification of landslides in loess areas based on the improved YOLO algorithm: a case study of loess landslides in Baoji City
Tian et al. Research on Monitoring and Auxiliary Audit Strategy of Transmission Line Construction Progress Based on Satellite Remote Sensing and Deep Learning
Shi et al. LSKF-YOLO: Large selective kernel feature fusion network for power tower detection in high-resolution satellite remote sensing images
Wang et al. Spartina alterniflora classification at patch scale based on feature fusion and deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant