CN117036929A - Power transmission tower identification method and system based on shadow assistance and rotating frame detection - Google Patents



Publication number
CN117036929A
Authority
CN
China
Prior art keywords
feature map
feature
shadow
transmission tower
power transmission
Prior art date
Legal status: Pending
Application number
CN202310766577.2A
Other languages
Chinese (zh)
Inventor
周宇
胡方舟
谈元鹏
支妍力
莫文昊
安康
薄海旺
马小新
刘沛轩
Current Assignee
State Grid Jiangxi Electric Power Co ltd
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
State Grid Jiangxi Electric Power Co ltd
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
Application filed by State Grid Jiangxi Electric Power Co ltd, State Grid Corp of China SGCC, and China Electric Power Research Institute Co Ltd CEPRI.
Priority application: CN202310766577.2A
Publication: CN117036929A


Classifications

    • G06V20/10 Terrestrial scenes
    • G06N3/045 Combinations of networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06V10/40 Extraction of image or video features
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; context analysis; selection of dictionaries
    • G06V10/764 Classification, e.g. of video objects
    • G06V10/774 Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V10/806 Fusion of extracted features
    • G06V10/82 Using neural networks
    • G06T2207/20081 Training; learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06V2201/07 Target detection


Abstract

The application relates to a method and system for identifying power transmission towers based on shadow assistance and rotating frame detection. For an input remote sensing image, a lightweight feature extraction network extracts the corresponding feature maps through its successive sub-modules, and a bidirectional weighted feature fusion network fuses the feature maps across resolutions. Rotating frame detection on the fused feature maps then determines the tower positions, shadow positions, and shadow categories. Towers and shadows are matched by computing the shortest Euclidean distance between the corner points of each shadow-tower pair, and each matched shadow's category is assigned to its tower, thereby classifying the towers. This addresses the limitation of current detection algorithms, which typically detect transmission towers as a single unified class without distinguishing specific tower types; here, tower classification is achieved in a shadow-assisted manner.

Description

Power transmission tower identification method and system based on shadow assistance and rotating frame detection
Technical Field
The application belongs to the technical field of power equipment target identification, and particularly relates to a power transmission tower identification method and system based on shadow assistance and rotating frame detection.
Background
The transmission tower is the main supporting structure for cables and other electrical components in overhead lines, and is one of the most common and important power assets of supply enterprises. Detecting transmission towers provides useful information for asset accounting and power infrastructure project management. However, traditional manual inspection is inefficient and labor-intensive, while unmanned aerial vehicles have a small imaging range and limited flight time and cannot meet the demands of large-scale tower detection. High-resolution satellite remote sensing imaging is not limited by terrain or landform, offers a wide imaging range and high precision, and can be used for large-scale transmission tower detection.
Advances in target detection based on deep convolutional neural networks (DCNNs) have improved the ability to detect transmission towers in optical remote sensing images. Related studies have applied various DCNN-based detection methods to this task; however, most do not distinguish tower categories and instead detect all towers as one unified class. In practice, transmission towers are diverse, with obvious differences in structure, appearance, and size, so accurate detection of the tower class is also necessary. Because optical remote sensing images are captured from high altitude, the shape information of a tower is incomplete, which makes identifying the tower type a significant challenge.
Research on how to use deep convolutional neural networks to extract key information from high-resolution satellite remote sensing images, so as to classify tower types and locate towers, therefore has important practical value.
Existing transmission tower identification methods still suffer from several problems: 1. common detection networks locate targets with horizontal boxes, which cannot accurately localize long, narrow, arbitrarily oriented objects; 2. existing methods detect all transmission towers as one unified category without distinguishing specific types; 3. because optical remote sensing images are captured from high altitude, tower shape information is incomplete, making tower type identification difficult.
Disclosure of Invention
Aiming at the limitations of existing methods, the application provides a transmission tower identification method based on shadow assistance and rotating frame detection, which determines tower position information, shadow position information, and shadow category information through rotating frame detection with a target detection model, and matches towers with shadows by computing the shortest Euclidean distance between the corner points of each shadow-tower pair, so as to effectively detect the category and position of transmission towers.
The present application is achieved as follows. A power transmission tower identification method based on shadow assistance and rotating frame detection comprises the following steps:
step one: input the remote sensing image to be detected into a lightweight feature extraction network, which extracts the corresponding feature maps S3, S4 and S5 through its sub-modules M2, M3, M4 and M5 in sequence; from S2 to S5 the spatial size decreases, the image detail information decreases, and the high-level semantic information increases. Feature maps S3, S4 and S5 are taken as the feature extraction network outputs, and S5 is downsampled twice to obtain feature maps S6 and S7;
Step two: fusing the feature graphs of all resolutions by adopting a bidirectional weighted feature fusion network;
step three: determining the position information, the shadow position information and the shadow category information of the power transmission tower through rotating frame detection by utilizing the fused feature map;
step four: matching the power transmission tower and the shadow is realized by calculating the shortest Euclidean distance between each pair of shadows and the corner points of the power transmission tower, and the matched shadow categories are given to the power transmission tower, so that the classification of the power transmission tower categories is realized.
Further preferably, the bidirectional weighted feature fusion network fuses the feature maps of each resolution as follows: feature map S7 is upsampled and spliced with feature map S6 in the channel dimension, and the spliced feature map is convolved to obtain feature map S'6; S'6 is upsampled and spliced with S5 in the channel dimension, and the result is convolved to obtain S'5; S'5 is upsampled and spliced with S4 in the channel dimension, and the result is convolved to obtain S'4; S'4 is upsampled and spliced with S3 in the channel dimension, and the result is convolved to obtain S'3. After obtaining S'4 and S'3, feature map S'3 is downsampled and spliced with S'4 and S4, and the result is convolved to obtain S''4; S''4 is downsampled and spliced with S'5 and S5, and the result is convolved to obtain S''5; S''5 is downsampled and spliced with S'6 and S6, and the result is convolved to obtain S''6; S''6 is then downsampled and spliced with S7, and the result is convolved to obtain S''7. Finally, feature maps S''7, S''6, S''5, S''4 and S'3 are output as the fused feature maps.
More preferably, the fused feature maps S''7, S''6, S''5, S''4 and S'3 are used to predict the tower position, shadow position, and shadow category through a 1×1 convolution layer.
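Since a 1×1 convolution is just a per-pixel linear map over channels, the prediction layer described above can be sketched as a matrix product over a flattened feature map. The output layout here (5 rotated-box parameters plus a few class scores) is an illustrative assumption, not the patent's exact head design:

```python
import numpy as np

def conv1x1(feat, weight, bias):
    """1x1 convolution over a (C, H, W) feature map as a per-pixel matmul."""
    c, h, w = feat.shape
    out = weight @ feat.reshape(c, h * w) + bias[:, None]
    return out.reshape(-1, h, w)

C, H, W = 16, 8, 8
n_out = 5 + 3                     # 5 rotated-box parameters + (say) 3 shadow classes
feat = np.random.rand(C, H, W)    # a fused feature map
weight = np.random.rand(n_out, C)
bias = np.zeros(n_out)
pred = conv1x1(feat, weight, bias)
assert pred.shape == (n_out, H, W)
```

Each spatial location thus yields one prediction vector whose first five entries can be read as (c_x, c_y, w, h, θ).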
Further preferably, the tower position in the image is represented as (c_x, c_y, w, h, θ), where c_x and c_y denote the center coordinates of the rotating frame, w and h its width and height, and θ the rotation angle.
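The (c_x, c_y, w, h, θ) parameterization above can be expanded into the box's four corner points with a plane rotation. A minimal NumPy sketch; the angle unit (radians) and the counter-clockwise convention are assumptions, since the patent does not specify them:

```python
import numpy as np

def rotated_box_corners(cx, cy, w, h, theta):
    """Return the 4 corner points of a rotated box (cx, cy, w, h, theta).

    theta is assumed to be the counter-clockwise rotation angle in radians;
    corners are returned by rotating the axis-aligned box about its center.
    """
    # Axis-aligned corner offsets relative to the box center.
    offsets = np.array([[-w / 2, -h / 2],
                        [ w / 2, -h / 2],
                        [ w / 2,  h / 2],
                        [-w / 2,  h / 2]])
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    return offsets @ rot.T + np.array([cx, cy])
```

For example, `rotated_box_corners(0, 0, 2, 4, 0.0)` returns the axis-aligned corners of a 2×4 box centered at the origin.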
Further preferably, let (x_i, y_i), i = 1, 2, 3, 4 denote the position coordinates of the 4 corner points of a transmission tower, and (x_j, y_j), j = 1, 2, 3, 4 denote those of a shadow. The shortest Euclidean distance between the tower and the shadow is d = min_{i,j} √((x_i − x_j)² + (y_i − y_j)²); if this shortest distance is smaller than a threshold, the tower and the target shadow are considered matched.
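The corner-distance matching rule above can be sketched directly in NumPy. The greedy one-tower-at-a-time assignment and the threshold value are illustrative assumptions; the patent only specifies the shortest-corner-distance criterion:

```python
import numpy as np

def shortest_corner_distance(tower_corners, shadow_corners):
    """Shortest Euclidean distance between any tower corner and any shadow corner."""
    diff = tower_corners[:, None, :] - shadow_corners[None, :, :]   # (4, 4, 2)
    return np.sqrt((diff ** 2).sum(axis=-1)).min()

def match_shadows(towers, shadows, threshold):
    """Greedy matching: each tower takes the nearest shadow within the threshold.

    `towers` and `shadows` are lists of (4, 2) corner arrays. Returns a dict
    mapping tower index -> matched shadow index.
    """
    matches = {}
    for ti, tc in enumerate(towers):
        dists = [shortest_corner_distance(tc, sc) for sc in shadows]
        if dists and min(dists) < threshold:
            matches[ti] = int(np.argmin(dists))
    return matches
```

After matching, each matched shadow's category would be assigned to its tower, as step four describes.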
The application also provides a transmission tower identification system based on shadow assistance and rotating frame detection, comprising a target detection model and a tower-shadow matching module. The target detection model consists of a lightweight feature extraction network integrating an attention mechanism, a bidirectional weighted feature fusion network, and a prediction network. The lightweight feature extraction network extracts feature maps S2, S3, S4 and S5; from S2 to S5 the spatial size decreases, the image detail information decreases, and the high-level semantic information increases. Feature maps S3, S4 and S5 are taken as the network outputs, and S5 is downsampled twice to obtain S6 and S7. The bidirectional weighted feature fusion network then fuses the feature maps S3, S4, S5, S6 and S7 of different scales, and the prediction network performs rotating frame detection on the fused feature maps to determine tower position information, shadow position information, and shadow category information. The tower-shadow matching module matches towers with shadows by computing the shortest Euclidean distance between the corner points of each shadow-tower pair.
Further preferably, the lightweight feature extraction network integrating the attention mechanism comprises four serially connected feature extraction sub-modules M2, M3, M4 and M5 of similar structure: M2 comprises 3 groups of residual blocks and attention modules, M3 comprises 4 groups, M4 comprises 6 groups, and M5 comprises 3 groups; each attention module consists of a spatial attention module and a channel attention module.
Further preferably, each residual block first reduces the dimension of its input feature map F_in ∈ R^(C_in×H×W) with a 1×1 convolution, extracts features with 32 grouped 3×3 convolutions, splices all grouped convolution outputs together in parallel along the channel dimension, raises the dimension with a 1×1 convolution, and finally adds a residual connection to obtain its output feature map F_out ∈ R^(C_out×H×W), where C_in is the number of channels of the input feature map, C_out the number of channels of the output feature map, C'_out the number of channels of the grouped convolution, H the feature map height, and W the feature map width.
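The saving from the 32-group 3×3 convolution used above can be checked with simple arithmetic: a grouped convolution splits both input and output channels into g slices, so its weight count is 1/g of a standard convolution with the same channel widths. A small sketch with illustrative channel sizes:

```python
def conv3x3_params(c_in, c_out, groups=1):
    """Weight count of a 3x3 convolution with the given number of groups."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each of the `groups` groups maps c_in/groups channels to c_out/groups channels.
    return groups * (c_in // groups) * 3 * 3 * (c_out // groups)

standard = conv3x3_params(256, 256, groups=1)
grouped = conv3x3_params(256, 256, groups=32)
assert standard == grouped * 32   # grouped conv uses 1/32 of the parameters
```

This is the arithmetic behind the "1/32 of the parameter quantity" claim made later in the description.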
Further preferably, feature enhancement is performed with the spatial attention module and the channel attention module: the output feature map F_out ∈ R^(C×H×W) is first multiplied channel by channel with the channel attention map M_c ∈ R^(C×1×1) to obtain the fused channel attention feature map F'_out ∈ R^(C×H×W); F'_out is then multiplied element-wise with the spatial attention map M_s ∈ R^(1×H×W) to obtain the fused channel and spatial attention feature map F''_out ∈ R^(C×H×W).
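The two attention multiplications above are broadcast products over a C×H×W tensor: M_c has shape (C, 1, 1) and M_s has shape (1, H, W). A NumPy sketch of the shapes only; the attention maps themselves would come from learned layers, which are omitted here:

```python
import numpy as np

C, H, W = 8, 4, 4
f_out = np.random.rand(C, H, W)   # residual-block output F_out
m_c = np.random.rand(C, 1, 1)     # channel attention map M_c
m_s = np.random.rand(1, H, W)     # spatial attention map M_s

f_c = f_out * m_c    # channel-wise reweighting  -> F'_out
f_cs = f_c * m_s     # spatial reweighting       -> F''_out
assert f_cs.shape == (C, H, W)
```

Broadcasting keeps the output shape at (C, H, W) while each channel and each spatial location is scaled independently.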
Further preferably, the target detection model is trained with a decaying warm-restart stochastic gradient descent training strategy.
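The patent names a decaying warm-restart ("attenuation hot start") SGD strategy without giving formulas. One common realization is cosine annealing with warm restarts whose peak learning rate decays each cycle; this concrete schedule and its hyperparameter values are assumptions, not the patent's exact recipe:

```python
import math

def decayed_warm_restart_lr(step, base_lr=0.01, cycle_len=10, decay=0.5, min_lr=1e-5):
    """Cosine-annealed learning rate with warm restarts; the peak decays each cycle.

    At every restart the learning rate jumps back up, which helps the optimizer
    escape local optima; the decaying peak keeps late training stable.
    """
    cycle, pos = divmod(step, cycle_len)
    peak = base_lr * (decay ** cycle)
    return min_lr + 0.5 * (peak - min_lr) * (1 + math.cos(math.pi * pos / cycle_len))
```

Within a cycle the rate decays smoothly from the cycle's peak toward `min_lr`, and each restart halves the peak.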
Still preferably, the present application provides a nonvolatile computer storage medium storing computer executable instructions that can perform the above-described power transmission tower identification method based on shadow assistance and rotating frame detection.
Still further preferably, the present application provides a computer program product comprising a computer program stored on a non-volatile computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described method of power transmission tower identification based on shadow assistance and rotating frame detection.
Still preferably, the present application provides an electronic apparatus comprising one or more processors, a memory, an input device, and an output device; the processor executes the functional applications and data processing of the server by running the nonvolatile software programs, instructions, and modules stored in the memory, thereby implementing the above transmission tower identification method based on shadow assistance and rotating frame detection.
Considering the differences among the feature maps of different resolutions output by the feature extraction network, the application provides a bidirectional weighted feature fusion network that fuses feature maps of different resolutions to enhance their expressive capacity. Specifically, the low-resolution feature maps output by the feature extraction network are upsampled and fused with the adjacent higher-resolution feature maps in sequence from low to high resolution; the high-resolution feature maps are then downsampled and fused with the adjacent lower-resolution feature maps in sequence from high to low resolution, finally yielding the fused multi-scale feature maps. To effectively prevent the network from falling into a local optimum during training, the application provides a decaying warm-restart stochastic gradient descent training strategy.
Most remote sensing images are captured from an overhead view, in which the tower body is usually small and its shape cannot be fully displayed, whereas the tower's shadow in the overhead view completely preserves the tower's shape information; the shadow category can therefore assist in identifying the tower type. Addressing the low detection accuracy of horizontal-rectangle detectors on rotated, fixed-shape power equipment, the application achieves high-precision identification of transmission towers and their shadows in high-resolution remote sensing images based on rotating frame detection. Addressing the problem that current detection algorithms usually detect transmission towers as one unified class without distinguishing specific types, targets are matched with target shadows based on Euclidean distance, and tower classification is achieved in a shadow-assisted manner.
Drawings
FIG. 1 is a flowchart of the identification algorithm.
Fig. 2 compares a conventional feature extraction network with the lightweight feature extraction network.
Fig. 3 is a diagram of the bidirectional weighted feature fusion network.
Fig. 4 is a flow diagram of shadow-assisted transmission tower detection.
Detailed Description
The present application is explained in further detail below.
The transmission tower identification system based on shadow assistance and rotating frame detection comprises a target detection model and a tower-shadow matching module. The target detection model consists of a lightweight feature extraction network integrating an attention mechanism, a bidirectional weighted feature fusion network, and a prediction network: feature maps are extracted by the lightweight feature extraction network, fused across scales by the bidirectional weighted feature fusion network, and the prediction network performs rotating frame detection on the fused feature maps to determine tower position information, shadow position information, and shadow category information. The tower-shadow matching module matches towers with shadows by computing the shortest Euclidean distance between the corner points of each shadow-tower pair.
The lightweight feature extraction network integrating the attention mechanism comprises four serially connected feature extraction sub-modules M2, M3, M4 and M5 of similar structure: M2 comprises 3 groups of residual blocks and attention modules, M3 comprises 4 groups, M4 comprises 6 groups, and M5 comprises 3 groups; each attention module consists of a spatial attention module and a channel attention module.
Each residual block first reduces the dimension of its input feature map F_in ∈ R^(C_in×H×W) with a 1×1 convolution, extracts features with 32 grouped 3×3 convolutions, splices all grouped convolution outputs together in parallel along the channel dimension, raises the dimension with a 1×1 convolution, and finally adds a residual connection to obtain its output feature map F_out ∈ R^(C_out×H×W), where C_in is the number of channels of the input feature map, C_out the number of channels of the output feature map, C'_out the number of channels of the grouped convolution, H the feature map height, and W the feature map width. Compared with the conventional residual structure, the parameter count of the 3×3 convolution kernels in this structure is (C_out × 3 × 3 × C'_out)/32, i.e. 1/32 of that of the conventional residual structure, so the parameter count is greatly reduced while the network width is greatly increased, improving the network's representational capacity while keeping the structure lightweight. Considering the differences of the residual block output feature map F_out in feature dimension and spatial position distribution, the application enhances features with the spatial attention module and the channel attention module. Specifically, the output feature map F_out ∈ R^(C×H×W) is first multiplied channel by channel with the channel attention map M_c ∈ R^(C×1×1) to obtain the fused channel attention feature map F'_out ∈ R^(C×H×W); F'_out is then multiplied element-wise with the spatial attention map M_s ∈ R^(1×H×W) to obtain the fused channel and spatial attention feature map F''_out ∈ R^(C×H×W), so that the network concentrates on extracting the more important information. Fig. 2 compares a conventional feature extraction network with the lightweight feature extraction network.
Referring to fig. 1, 3 and 4, the application provides a power transmission tower identification method based on shadow assistance and rotating frame detection, which comprises the following steps:
step one, feature extraction: input the remote sensing image I to be detected into the lightweight feature extraction network, which extracts the corresponding feature maps S2, S3, S4 and S5 through sub-modules M2, M3, M4 and M5 in sequence, i.e. S2 = f_M2(I), S3 = f_M3(S2), S4 = f_M4(S3), S5 = f_M5(S4), where f_Mi denotes the input-to-output mapping of sub-module Mi. From S2 to S5 the spatial size decreases, the image detail information decreases, and the high-level semantic information increases. Feature maps S3, S4 and S5 are taken as the feature extraction network outputs, and S5 is downsampled twice to obtain feature maps S6 and S7. Downsampling S5 yields feature maps of smaller resolution and expands the range of target sizes the network can detect: the smaller the resolution of a feature map, the larger the receptive field of each of its pixels on the original image, and the larger the objects that can be detected.
Step two, feature fusion: considering the difference in expressive capacity among the feature maps of different resolutions output by the lightweight feature extraction network, a bidirectional weighted feature fusion network is adopted to fuse the feature maps of different resolutions and thereby enhance their expressive capacity. Specifically, referring to fig. 3, feature map S7 is upsampled and spliced with feature map S6 along the channel dimension, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S'6; feature map S'6 is upsampled and spliced with feature map S5 along the channel dimension, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S'5; feature map S'5 is upsampled and spliced with feature map S4 along the channel dimension, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S'4; feature map S'4 is upsampled and spliced with feature map S3 along the channel dimension, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S'3. After feature maps S'4 and S'3 are obtained, feature map S'3 is downsampled and spliced with feature maps S'4 and S4, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S''4; feature map S''4 is downsampled and spliced with feature maps S'5 and S5, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S''5; feature map S''5 is downsampled and spliced with feature maps S'6 and S6, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S''6; feature map S''6 is then downsampled and spliced with feature map S7, and the spliced feature map is convolved with a 3×3 convolution kernel to obtain feature map S'7. Finally, feature maps S'7, S''6, S''5, S''4 and S'3 are output as the fused feature maps.
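The fusion order described above can be traced symbolically. This is only a bookkeeping sketch of the splice-and-convolve steps (the weighting itself is not modeled):

```python
def fusion_plan(levels=("S3", "S4", "S5", "S6", "S7")):
    # symbolic trace of step two: a top-down pass producing S6'..S3',
    # then a bottom-up pass producing S4''..S6'' and finally S7'
    plan, prev = [], levels[-1]
    for lvl in reversed(levels[:-1]):           # top-down: S6, S5, S4, S3
        out = lvl + "'"
        plan.append(f"concat(up({prev}), {lvl}) -conv3x3-> {out}")
        prev = out
    for lvl in levels[1:-1]:                    # bottom-up: S4, S5, S6
        out = lvl + "''"
        plan.append(f"concat(down({prev}), {lvl}', {lvl}) -conv3x3-> {out}")
        prev = out
    plan.append(f"concat(down({prev}), {levels[-1]}) -conv3x3-> {levels[-1]}'")
    return plan

for step in fusion_plan():
    print(step)
```

The trace makes the bidirectional structure explicit: each bottom-up node fuses three inputs (the downsampled finer fused map, the top-down map of the same level, and the raw backbone map), while the two end levels fuse only two.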
Step three, rotating frame detection: using the fused feature maps S'7, S''6, S''5, S''4 and S'3, the position information of the power transmission tower, the shadow position information and the shadow category information are determined through rotating frame detection. In the prediction network, the power transmission tower position, the shadow position and the shadow category are predicted through a 1×1 convolution layer, and the power transmission tower position and the shadow position are each represented by (cx, cy, w, h, θ), where cx, cy, w, h and θ respectively denote the abscissa and ordinate of the center point of the rotating frame (i.e., the detection frame), the width and height of the rotating frame, and the rotation angle. In the training process, errors are calculated between the output target position and category information and the real target position and category information, and the network parameters are updated through an error back-propagation algorithm.
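For illustration, the four corner points of a rotating frame (cx, cy, w, h, θ) can be recovered as below. The convention that θ is measured in radians counterclockwise about the center is an assumption for this sketch, since the application does not fix an angle convention:

```python
import math

def rbox_corners(cx, cy, w, h, theta):
    # corner points of a rotated box: rotate the axis-aligned half-extents
    # by theta (radians, counterclockwise) and translate to the center
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

print(rbox_corners(0.0, 0.0, 4.0, 2.0, math.pi / 2))
```

With θ = 0 the formula degenerates to the ordinary axis-aligned box, so the representation strictly generalizes horizontal detection frames.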
Step four, matching targets with target shadows based on Euclidean distance: the power transmission towers and shadows detected by the target detection model are usually independent of one another, and a single optical remote sensing image usually contains several power transmission towers and shadows. Specifically, let (xi, yi), i = 1, 2, 3, 4, denote the position coordinates of the 4 corner points of a power transmission tower and (xj, yj), j = 1, 2, 3, 4, denote the position coordinates of the 4 corner points of a shadow; the shortest Euclidean distance between the power transmission tower and the shadow is calculated as d = min_{i,j} √((xi − xj)² + (yi − yj)²). If the shortest distance between the power transmission tower and a shadow is smaller than a threshold value, the target and the shadow are considered matched, and the category of the matched shadow is assigned to the power transmission tower, thereby realizing classification of the power transmission tower.
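A minimal sketch of the corner-distance matching in step four; the greedy nearest-shadow assignment and the corner-list format are illustrative assumptions:

```python
import math

def shortest_corner_distance(tower, shadow):
    # shortest Euclidean distance over all 4 x 4 pairs of corner points
    return min(math.hypot(xi - xj, yi - yj)
               for xi, yi in tower for xj, yj in shadow)

def match_shadows(towers, shadows, threshold):
    # assign each tower the nearest shadow, provided it lies within the threshold
    matches = {}
    for ti, tower in enumerate(towers):
        d, si = min((shortest_corner_distance(tower, s), si)
                    for si, s in enumerate(shadows))
        if d < threshold:
            matches[ti] = si
    return matches
```

Once matched, the shadow's category (already predicted by the rotating-frame head) is simply copied onto the tower detection.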
In this embodiment, in order to effectively prevent the target detection model from falling into a local optimum during training, a decayed warm-start stochastic gradient descent training strategy is adopted to train the target detection model. Specifically, on the basis of the original stochastic gradient descent strategy W_{t+1} = W_t − (l/N) Σ_{i=1}^{N} ∂L_i/∂W_t, where W_t and W_{t+1} respectively denote the network parameters in rounds t and t+1, l denotes the learning rate, N denotes the number of images participating in training in round t, and L_i denotes the loss of the i-th sample, warm-start training is performed by gradually increasing the learning rate l to a preset initial learning rate over the first 5 training rounds; the learning rate l is then gradually reduced according to a cosine function with a period of 50 training rounds, and every 50 rounds the initial learning rate multiplied by an attenuation coefficient of 0.8 is taken as the initial learning rate of the next period. By abruptly raising the learning rate at each restart, this method helps the target detection model jump out of local optimal solutions and find a path toward the global optimal solution.
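The schedule can be sketched as follows, assuming a linear warm-up over the first 5 rounds (the exact warm-up shape is not specified in the text) and that each 50-round cosine period restarts from the attenuated peak:

```python
import math

def learning_rate(epoch, base_lr=0.01, warmup=5, period=50, decay=0.8):
    # warm start: ramp linearly up to base_lr over the first `warmup` rounds
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup
    # afterwards: cosine decay within each 50-round period; every restart
    # begins from the previous peak multiplied by the 0.8 attenuation coefficient
    cycle, pos = divmod(epoch - warmup, period)
    peak = base_lr * decay ** cycle
    return 0.5 * peak * (1 + math.cos(math.pi * pos / period))
```

The sudden jump from near zero back to 0.8× the previous peak at each restart is the mechanism intended to kick the optimizer out of local optima; the base learning rate 0.01 is a placeholder, not a value from the application.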
In other embodiments, a non-volatile computer storage medium is provided, the computer storage medium storing computer executable instructions that are capable of performing the method for identifying a power transmission tower based on shadow assistance and rotation box detection of any of the above embodiments.
The present embodiment also provides a computer program product comprising a computer program stored on a non-volatile computer storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method for identifying a power transmission tower based on shadow assistance and rotating frame detection of the above embodiments.
The present embodiment provides an electronic device including: one or more processors and memory. The electronic device may further include: input means and output means. The processor, memory, input devices, and output devices may be connected by a bus or other means. The memory is the non-volatile computer readable storage medium described above. The processor executes various functional applications and data processing of the server by running nonvolatile software programs, instructions and modules stored in the memory, that is, the power transmission tower identification method based on shadow assistance and rotation frame detection described in the above embodiment is implemented. The input device may receive input numeric or character information and generate key signal inputs related to user settings and function control of the transmission tower identification method based on shadow assistance and rotating frame detection. The output means may comprise a display device such as a display screen.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the application can be realized in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. The power transmission tower identification method based on shadow assistance and rotating frame detection is characterized by comprising the following steps:
step one: inputting the remote sensing image to be detected into a lightweight feature extraction network, the lightweight feature extraction network extracting corresponding feature maps S2, S3, S4 and S5 sequentially through submodules M2, M3, M4 and M5; the spatial sizes of feature maps S2, S3, S4 and S5 decrease in turn, the image detail information they contain decreases in turn, and the high-level semantic information they contain increases in turn; feature maps S3, S4 and S5 are taken as the output feature maps of the feature extraction network, and feature map S5 is downsampled twice to obtain feature maps S6 and S7;
Step two: fusing the feature graphs of all resolutions by adopting a bidirectional weighted feature fusion network;
step three: determining the position information, the shadow position information and the shadow category information of the power transmission tower through rotating frame detection by utilizing the fused feature map;
step four: matching the power transmission tower and the shadow is realized by calculating the shortest Euclidean distance between each pair of shadows and the corner points of the power transmission tower, and the matched shadow categories are given to the power transmission tower, so that the classification of the power transmission tower categories is realized.
2. The method for identifying the power transmission tower based on shadow assistance and rotating frame detection according to claim 1, wherein the process of fusing the feature maps of each resolution with the bidirectional weighted feature fusion network is as follows: feature map S7 is upsampled and spliced with feature map S6 along the channel dimension, and the spliced feature map is convolved with a convolution kernel to obtain feature map S'6; feature map S'6 is upsampled and spliced with feature map S5 along the channel dimension, and the spliced feature map is convolved with a convolution kernel to obtain feature map S'5; feature map S'5 is upsampled and spliced with feature map S4 along the channel dimension, and the spliced feature map is convolved with a convolution kernel to obtain feature map S'4; feature map S'4 is upsampled and spliced with feature map S3 along the channel dimension, and the spliced feature map is convolved with a convolution kernel to obtain feature map S'3; after feature maps S'4 and S'3 are obtained, feature map S'3 is downsampled and spliced with feature maps S'4 and S4, and the spliced feature map is convolved with a convolution kernel to obtain feature map S''4; feature map S''4 is downsampled and spliced with feature maps S'5 and S5, and the spliced feature map is convolved with a convolution kernel to obtain feature map S''5; feature map S''5 is downsampled and spliced with feature maps S'6 and S6, and the spliced feature map is convolved with a convolution kernel to obtain feature map S''6; feature map S''6 is then downsampled and spliced with feature map S7, and the spliced feature map is convolved with a convolution kernel to obtain feature map S'7; finally, feature maps S'7, S''6, S''5, S''4 and S'3 are output as the fused feature maps.
3. The method for identifying the power transmission tower based on shadow assistance and rotating frame detection according to claim 2, wherein the fused feature maps S'7, S''6, S''5, S''4 and S'3 are used to predict the power transmission tower position, the shadow position and the shadow category through a 1×1 convolution layer.
4. The transmission tower identification method based on shadow assistance and rotating frame detection according to claim 3, wherein the power transmission tower position and the shadow position are each represented by (cx, cy, w, h, θ), where cx, cy, w, h and θ respectively denote the abscissa and ordinate of the center point of the rotating frame, the width and height of the rotating frame, and the rotation angle.
5. The method for identifying a power transmission tower based on shadow assistance and rotating frame detection according to claim 1, wherein (xi, yi), i = 1, 2, 3, 4, denote the position coordinates of the 4 corner points of the power transmission tower and (xj, yj), j = 1, 2, 3, 4, denote the position coordinates of the 4 corner points of the shadow; the shortest Euclidean distance between the power transmission tower and the shadow is calculated as d = min_{i,j} √((xi − xj)² + (yi − yj)²), and if the shortest distance between the power transmission tower and the shadow is smaller than a threshold value, the target and the shadow are considered matched.
6. A power transmission tower identification system based on shadow assistance and rotating frame detection, characterized by comprising a target detection model and a power transmission tower shadow matching module, wherein the target detection model consists of a lightweight feature extraction network integrating an attention mechanism, a bidirectional weighted feature fusion network and a prediction network; feature maps S2, S3, S4 and S5 are extracted by the lightweight feature extraction network integrating the attention mechanism, the spatial sizes of feature maps S2, S3, S4 and S5 decreasing in turn, the image detail information they contain decreasing in turn, and the high-level semantic information they contain increasing in turn; feature maps S3, S4 and S5 are taken as the output feature maps of the feature extraction network, and feature map S5 is downsampled twice to obtain feature maps S6 and S7; the bidirectional weighted feature fusion network then fuses the feature maps S3, S4, S5, S6 and S7 of different scales, and the prediction network performs rotating frame detection based on the fused feature maps to determine the position information of the power transmission tower, the shadow position information and the shadow category information; the power transmission tower shadow matching module matches power transmission towers and shadows by calculating the shortest Euclidean distance between each pair of shadow and power transmission tower corner points.
7. The transmission tower identification system based on shadow assistance and rotating frame detection according to claim 6, wherein the lightweight feature extraction network integrating the attention mechanism comprises four sequentially connected submodules for feature extraction, namely submodule M2, submodule M3, submodule M4 and submodule M5; submodules M2, M3, M4 and M5 are similar in structure, submodule M2 comprising 3 groups of residual block plus attention module, submodule M3 comprising 4 groups of residual block plus attention module, submodule M4 comprising 6 groups of residual block plus attention module, and submodule M5 comprising 3 groups of residual block plus attention module, each attention module being composed of a spatial attention module and a channel attention module.
8. The transmission tower identification system based on shadow assistance and rotating frame detection according to claim 7, wherein each residual block maps its input feature map F_in ∈ R^(C_in×H×W) as follows: a 1×1 convolution first reduces the channel dimension; 32 parallel 3×3 grouped convolutions extract features; the outputs of all groups are spliced together along the channel dimension; a 1×1 convolution raises the channel dimension back; and a residual connection is finally added to obtain the block's output feature map F_out ∈ R^(C_out×H×W), where C_in denotes the number of channels of the input feature map, C_out the number of channels of the output feature map, C'_out the number of channels of the grouped convolution, H the feature map height, and W the feature map width;
the features are enhanced with the spatial attention module and the channel attention module: the output feature map F_out is first multiplied channel-by-channel with the channel attention map M_c ∈ R^(C×1×1) to obtain the fused channel-attention feature map F'_out ∈ R^(C×H×W); the fused channel-attention feature map F'_out is then multiplied element-by-element with the spatial attention map M_s ∈ R^(1×H×W) to obtain the fused channel-and-spatial-attention feature map F''_out ∈ R^(C×H×W).
9. A non-volatile computer storage medium having stored thereon computer executable instructions for performing the method for identifying a power transmission tower based on shadow assistance and rotating frame detection of any one of claims 1-5.
10. An electronic device, comprising: one or more processors and a memory, the electronic device further comprising an input device and an output device; characterized in that the processor executes the various functional applications and data processing of the server by running the nonvolatile software programs, instructions and modules stored in the memory, so as to implement the power transmission tower identification method based on shadow assistance and rotating frame detection according to any one of claims 1-5.
CN202310766577.2A 2023-06-27 2023-06-27 Power transmission tower identification method and system based on shadow assistance and rotating frame detection Pending CN117036929A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310766577.2A CN117036929A (en) 2023-06-27 2023-06-27 Power transmission tower identification method and system based on shadow assistance and rotating frame detection


Publications (1)

Publication Number Publication Date
CN117036929A true CN117036929A (en) 2023-11-10



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination