CN115424016A - High-voltage transmission line identification method - Google Patents

High-voltage transmission line identification method

Info

Publication number
CN115424016A
CN115424016A (application number CN202211006355.2A)
Authority
CN
China
Prior art keywords
feature
convolution
layer
feature map
diagram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211006355.2A
Other languages
Chinese (zh)
Inventor
刘传洋
刘姚军
刘景景
孙佐
陈林
徐华结
陈士博
孔祥涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chizhou University
Original Assignee
Chizhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chizhou University
Priority to CN202211006355.2A
Publication of CN115424016A
Legal status: Withdrawn (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a high-voltage power transmission line identification method, relating to the technical field of high-voltage power transmission line feature extraction. The method automatically identifies and extracts power transmission line features in aerial images by means of a semantic segmentation network model, so that a trained model can effectively extract the power transmission line features in the images. In the invention, Res2Net residual modules serve as the feature extraction network of an improved U-Net network model; by grouping the feature channels and connecting them hierarchically in the manner of a filter bank, multiple receptive fields of finer granularity are obtained, which effectively improves the extraction of linear features. Squeeze-and-Excitation (SE) modules are arranged as cross-layer connections between corresponding layers of the encoder and decoder, and linear features are enhanced by assigning corresponding attention weights to different feature maps. A feature fusion module is connected between the bottom layers of the encoder and decoder; image features are mapped at multiple scales through dilated (atrous) convolution to capture global information and improve the identification accuracy of the power transmission line.

Description

High-voltage transmission line identification method
Technical Field
The invention relates to the technical field of high-voltage transmission line feature extraction, in particular to a high-voltage transmission line identification method.
Background
With the growing demand of social and economic development for stable and reliable power grid operation, the power grid has become an important component of economic development. The high-voltage transmission line is one of the most important components of the power grid: it carries the transmission and distribution of electric energy, and it is of obvious significance to the safety and stability of the power supply. Because high-voltage transmission lines span long distances and the power equipment is exposed outdoors for long periods, aging, damage and corrosion of the equipment inevitably occur, leaving great hidden dangers for the safe and stable operation of the power grid. Therefore, to ensure continuous and reliable power transmission, power companies regularly patrol transmission lines and distribution networks using different inspection methods.
In recent years, image processing technology, unmanned aerial vehicle (UAV) control technology and computer vision technology have matured rapidly, and big data and mobile interconnection technologies have driven the development of the power grid toward intelligence and automation. The State Grid has begun to use UAVs in place of manual inspection; carrying out power transmission line inspection with UAVs saves a large amount of money and time and keeps inspection personnel away from dangerous work. However, during low-altitude UAV flight the power transmission line is one of the most threatening hazards and the most difficult obstacle to avoid, so accurate identification of the high-voltage power transmission line in aerial images is of great practical significance for the automatic obstacle avoidance and safe flight of the UAV and for the safe and stable operation of the power grid. Existing high-voltage transmission line identification methods are mainly based on the Hough transform, the Radon transform, the LSD line segment detector, or scanning marks. These methods cannot reliably distinguish the transmission line from many other straight lines, require different thresholds to be set manually for different application scenarios, and often need several trials of certain threshold parameters to achieve the best effect, which clearly does not meet the requirements of automatic transmission line inspection.
Disclosure of Invention
Technical problem to be solved
To address the shortcomings of existing high-voltage power transmission line identification methods, the invention provides a high-voltage power transmission line identification method based on deep learning, which uses a deep convolutional neural network to construct a semantic segmentation network model that automatically identifies the high-voltage power transmission line in power inspection images.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme:
a high-voltage transmission line identification method comprises the following steps:
s1, constructing a semantic segmentation network model for identifying the high-voltage transmission line by adopting a deep convolution neural network;
s2, training and optimizing the semantic segmentation network model established in the step S1 on the aerial photography power line data set;
s3, loading the semantic segmentation network model trained in the step S2 to a cloud platform server, collecting an aerial power line image of the unmanned aerial vehicle, processing the image and transmitting the image to the cloud platform server, and further optimizing and training the semantic segmentation network model by the cloud platform server in a reinforcement learning mode;
and S4, deploying the semantic segmentation network model trained in the step S3 to an edge computing platform to perform high-voltage transmission line online identification.
Preferably, the semantic segmentation network model in step S3 is an improved U-net network model composed of an encoder, a feature fusion module, a decoder and an attention mechanism module, wherein the bottommost layer of the encoder is connected with the bottommost layer of the decoder through the feature fusion module, and the feature layers of the encoder other than the bottommost layer are each connected with the corresponding feature layer of the decoder through an attention mechanism module.
Preferably, the encoder comprises a first feature layer, a second feature layer, a third feature layer, a fourth feature layer and a fifth feature layer; each of the five feature layers is a residual module, and the five feature layers are connected in sequence through max pooling layers.
Preferably, the residual module comprises a first 1 × 1 convolution, a second 1 × 1 convolution, a first 3 × 3 convolution, a second 3 × 3 convolution and a third 3 × 3 convolution; the input feature map X00 is channel-adjusted by the first 1 × 1 convolution and then divided evenly into 4 groups, namely feature map X01, feature map X02, feature map X03 and feature map X04; feature map X01 is output directly as feature map Y01; feature map X02 is output as feature map Y02 after the first 3 × 3 convolution; feature map X03 is fused with feature map Y02 and output as feature map Y03 after the second 3 × 3 convolution; feature map X04 is fused with feature map Y03 and output as feature map Y04 after the third 3 × 3 convolution; feature maps Y01, Y02, Y03 and Y04 are first concatenated (Concat), passed through the second 1 × 1 convolution to adjust channels, and then added (Add) to the input feature map X00 to give the output feature map Y00.
Preferably, the residual module comprises a first 1 × 1 convolution, a second 1 × 1 convolution, a first 3 × 3 convolution, a second 3 × 3 convolution and a third 3 × 3 convolution; the input feature map X10 is channel-adjusted by the first 1 × 1 convolution and then divided evenly into 4 groups, namely feature map X11, feature map X12, feature map X13 and feature map X14; feature map X11 is output directly as feature map Y11; feature map X12 is fused with feature map Y11 and output as feature map Y12 after the first 3 × 3 convolution; feature map X13 is fused with feature maps Y11 and Y12 and output as feature map Y13 after the second 3 × 3 convolution; feature map X14 is fused with feature maps Y11, Y12 and Y13 and output as feature map Y14 after the third 3 × 3 convolution; feature maps Y11, Y12, Y13 and Y14 are first concatenated (Concat), passed through the second 1 × 1 convolution to adjust channels, and then added to the input feature map X10 to give the output feature map Y10.
Preferably, the feature fusion module is a pyramid module formed by four dilated (atrous) convolutions; the four dilated convolutions are 3 × 3 convolutions with dilation rates of 1, 6, 12 and 18 respectively and are connected in parallel; the input feature map is concatenated (Concat) with the parallel outputs of the four dilated convolutions, and the result is passed through a 1 × 1 convolution to adjust channels and give the output feature map.
Preferably, the attention mechanism module is a Squeeze-and-Excitation (SE) module comprising a global average pooling layer, a first fully connected layer, an activation function layer, a second fully connected layer and a Sigmoid function layer; the input feature map is pooled by the global average pooling layer; the first fully connected layer produces a vector corresponding to the number of channels; after channel compression and the activation function layer, the second fully connected layer restores a vector equal in length to the number of channels; the Sigmoid function layer normalizes this vector into weight vectors, and the input feature map is multiplied by the corresponding weights to give the output feature map.
(III) advantageous effects
The invention has the following beneficial effects. The high-voltage transmission line identification method uses Res2Net residual modules as the feature extraction network of an improved U-Net network model; by grouping the feature channels and connecting them hierarchically in the manner of a filter bank, multiple receptive fields of finer granularity are obtained, which effectively improves the extraction of linear features. SE modules are connected across corresponding layers of the encoder and decoder, and linear features are enhanced by assigning corresponding attention weights to different feature maps. A feature fusion module is connected between the bottom layers of the encoder and decoder, and image features are mapped at multiple scales through dilated convolution to capture global information and improve the identification accuracy of the power transmission line.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a high voltage transmission line identification method of the present invention;
FIG. 2 is a diagram of an improved U-net semantic segmentation network model architecture of the present invention;
FIG. 3 is a block diagram of a residual module according to one embodiment of the present invention;
FIG. 4 is a block diagram of a residual module according to another embodiment of the present invention;
FIG. 5 is a block diagram of a feature fusion module of the present invention;
FIG. 6 is a schematic diagram of the attention mechanism (SE) module of the present invention;
Fig. 7 shows power line identification results of the semantic segmentation network model of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With reference to fig. 1, a method for identifying a high-voltage transmission line includes the following steps:
s1, adopting a deep convolutional neural network to construct a semantic segmentation network model for high-voltage transmission line identification;
s2, training and optimizing the semantic segmentation network model established in the step S1 on the aerial photography power line data set;
s3, loading the semantic segmentation network model trained in the step S2 to a cloud platform server, acquiring an aerial power line image of the unmanned aerial vehicle, processing the image and transmitting the processed image to the cloud platform server, and further optimizing and training the semantic segmentation network model by the cloud platform server in a reinforcement learning mode;
and S4, deploying the semantic segmentation network model trained in the step S3 to an edge computing platform to perform online identification of the high-voltage transmission line.
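As a concrete illustration of step S2 above, a minimal PyTorch-style training-loop sketch is given below. The framework, the binary cross-entropy loss, the Adam optimizer and the hyperparameters are assumptions not specified in the patent; `train_set` is assumed to yield (image, binary power-line mask) pairs, and `model` is the semantic segmentation network built in step S1.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_segmentation_model(model, train_set, epochs=100, lr=1e-3, device="cuda"):
    """Train the semantic segmentation model on an aerial power-line data set (step S2)."""
    loader = DataLoader(train_set, batch_size=8, shuffle=True, num_workers=4)
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()               # power line vs. background, per pixel
    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        for images, masks in loader:
            images, masks = images.to(device), masks.to(device).float()
            optimizer.zero_grad()
            loss = criterion(model(images), masks)   # masks must match the logit shape
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f"epoch {epoch + 1}: mean loss {running_loss / len(loader):.4f}")
    return model
```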
With reference to fig. 2, the semantic segmentation network model of step S3 is an improved U-net network model. The improved U-net network model consists of an encoder, a feature fusion module, a decoder and an attention mechanism module; the bottommost layer of the encoder is connected with the bottommost layer of the decoder through the feature fusion module, and the feature layers of the encoder other than the bottommost layer are each connected with the corresponding feature layer of the decoder through an attention mechanism module.
The encoder comprises a first feature layer, a second feature layer, a third feature layer, a fourth feature layer and a fifth feature layer; each of the five feature layers is a residual module, and the five feature layers are connected in sequence through max pooling layers.
In order to prevent convergence failure caused by vanishing gradients as the network deepens, the method adopts Res2Net residual modules, which group the feature channels, connect them hierarchically in the manner of a filter bank, and obtain multiple receptive fields of finer granularity. The residual module replaces the original ResNet residual module with a Res2Net residual module; compared with other residual structures, the Res2Net module has hierarchical residual connections inside the module and uses a group of 3 × 3 convolutional layers in place of the original convolutional layer, giving the whole network finer-grained detection capability. Fig. 3 and Fig. 4 show two embodiments of the Res2Net residual module. The 3 × 3 convolution under each feature subgraph of the Res2Net residual module can reuse the preceding features, so its output has a larger receptive field; the scale is increased within a single layer, the receptive field range is enlarged, context information is better exploited, and by fully combining context information the classifier can more easily detect specific targets.
Fig. 3 shows one embodiment of the Res2Net residual module, which comprises a first 1 × 1 convolution, a second 1 × 1 convolution, a first 3 × 3 convolution, a second 3 × 3 convolution and a third 3 × 3 convolution. The input feature map X00 is channel-adjusted by the first 1 × 1 convolution and then divided evenly into 4 groups, namely feature map X01, feature map X02, feature map X03 and feature map X04. Feature map X01 is output directly as feature map Y01; feature map X02 is output as feature map Y02 after the first 3 × 3 convolution; feature map X03 is fused with feature map Y02 and output as feature map Y03 after the second 3 × 3 convolution; feature map X04 is fused with feature map Y03 and output as feature map Y04 after the third 3 × 3 convolution. Feature maps Y01, Y02, Y03 and Y04 are first concatenated (Concat), passed through the second 1 × 1 convolution to adjust channels, and then added (Add) to the input feature map X00 to give the output feature map Y00.
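As a concrete illustration of the Fig. 3 structure, the following PyTorch-style sketch implements the hierarchical Res2Net block described above. The framework (PyTorch), the batch normalization and ReLU placement, and the 1 × 1 shortcut projection for mismatched channel counts are assumptions, not details given in the patent.

```python
import torch
import torch.nn as nn

class Res2NetBlock(nn.Module):
    """Res2Net-style residual block (Fig. 3 variant): the 1x1-conv output is split
    into 4 groups; each later group is fused with the previous group's output
    before its 3x3 convolution, yielding several finer-grained receptive fields."""
    def __init__(self, in_ch, out_ch, groups=4):
        super().__init__()
        assert out_ch % groups == 0
        self.width = out_ch // groups
        self.conv_in = nn.Sequential(                      # first 1x1 conv: adjust channels
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
        self.convs = nn.ModuleList([                       # one 3x3 conv per group except the first
            nn.Sequential(nn.Conv2d(self.width, self.width, 3, padding=1, bias=False),
                          nn.BatchNorm2d(self.width), nn.ReLU(inplace=True))
            for _ in range(groups - 1)])
        self.conv_out = nn.Sequential(                     # second 1x1 conv after Concat
            nn.Conv2d(out_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch))
        self.shortcut = (nn.Conv2d(in_ch, out_ch, 1, bias=False)
                         if in_ch != out_ch else nn.Identity())

    def forward(self, x):
        xs = torch.chunk(self.conv_in(x), 4, dim=1)        # X01..X04
        y1 = xs[0]                                          # Y01 = X01 (identity)
        y2 = self.convs[0](xs[1])                           # Y02
        y3 = self.convs[1](xs[2] + y2)                      # Y03 = conv(X03 fused with Y02)
        y4 = self.convs[2](xs[3] + y3)                      # Y04 = conv(X04 fused with Y03)
        out = self.conv_out(torch.cat([y1, y2, y3, y4], 1)) # Concat then 1x1 conv
        return out + self.shortcut(x)                       # Add with input -> Y00
```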
Fig. 4 shows another embodiment of the Res2Net residual module, which likewise comprises a first 1 × 1 convolution, a second 1 × 1 convolution, a first 3 × 3 convolution, a second 3 × 3 convolution and a third 3 × 3 convolution. The input feature map X10 is channel-adjusted by the first 1 × 1 convolution and then divided evenly into 4 groups, namely feature map X11, feature map X12, feature map X13 and feature map X14. Feature map X11 is output directly as feature map Y11; feature map X12 is fused with feature map Y11 and output as feature map Y12 after the first 3 × 3 convolution; feature map X13 is fused with feature maps Y11 and Y12 and output as feature map Y13 after the second 3 × 3 convolution; feature map X14 is fused with feature maps Y11, Y12 and Y13 and output as feature map Y14 after the third 3 × 3 convolution. Feature maps Y11, Y12, Y13 and Y14 are first concatenated (Concat), passed through the second 1 × 1 convolution to adjust channels, and then added to the input feature map X10 to give the output feature map Y10.
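A minimal sketch of the Fig. 4 variant follows; it reuses the layers (and imports) of the `Res2NetBlock` sketch above and changes only the forward pass, so that each group is fused with all previous group outputs as described.

```python
class Res2NetBlockDense(Res2NetBlock):
    """Fig. 4 variant: every group is fused with *all* preceding group outputs
    before its 3x3 convolution (dense hierarchical fusion)."""
    def forward(self, x):
        xs = torch.chunk(self.conv_in(x), 4, dim=1)          # X11..X14
        y1 = xs[0]                                            # Y11 = X11 (identity)
        y2 = self.convs[0](xs[1] + y1)                        # Y12
        y3 = self.convs[1](xs[2] + y1 + y2)                   # Y13
        y4 = self.convs[2](xs[3] + y1 + y2 + y3)              # Y14
        out = self.conv_out(torch.cat([y1, y2, y3, y4], 1))   # Concat then 1x1 conv
        return out + self.shortcut(x)                         # Add with input -> Y10
```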
In order to enrich the extracted features and prevent exploding gradients as the network deepens, a feature fusion module is connected between the bottom layers of the encoder and decoder. With reference to Fig. 5, the feature fusion module is a pyramid module formed by four dilated (atrous) convolutions; the four dilated convolutions are 3 × 3 convolutions with dilation rates of 1, 6, 12 and 18 and are connected in parallel. The input feature map is concatenated (Concat) with the parallel outputs of the four dilated convolutions, and the result is passed through a 1 × 1 convolution to adjust channels and give the output feature map.
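This feature fusion module corresponds to an ASPP-like pyramid of dilated convolutions; a sketch under the same PyTorch assumption is given below, reusing the imports from the Res2Net sketch above (batch normalization and ReLU in each branch are assumptions).

```python
class FeatureFusion(nn.Module):
    """Pyramid feature-fusion module: four parallel 3x3 dilated convolutions with
    rates 1, 6, 12 and 18; their outputs are concatenated with the input feature
    map and reduced back to the original channel count by a 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=r, dilation=r, bias=False),
                          nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
            for r in (1, 6, 12, 18)])
        self.project = nn.Conv2d(channels * 5, channels, 1)    # 1x1 conv adjusts channels

    def forward(self, x):
        feats = [x] + [branch(x) for branch in self.branches]  # input + 4 dilated branches
        return self.project(torch.cat(feats, dim=1))           # Concat then 1x1 conv
```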
In order to give different feature maps different attention weights and thereby enhance and extract the linear features of the power transmission line, an attention mechanism SE module is introduced into the skip connections between corresponding layers of the encoder and decoder. With reference to Fig. 6, the SE module comprises a global average pooling layer, a first fully connected layer, an activation function layer, a second fully connected layer and a Sigmoid function layer. The input feature map is pooled by the global average pooling layer; the first fully connected layer produces a vector corresponding to the number of channels; after channel compression and the activation function layer, the second fully connected layer restores a vector equal in length to the number of channels; finally, the Sigmoid function layer normalizes this vector into weight vectors, and the input feature map is multiplied by the corresponding weights to give the output feature map.
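The SE attention module can be sketched as follows; the reduction ratio of 16 used for the channel compression is an assumption, as the patent does not give a specific value.

```python
class SEBlock(nn.Module):
    """Squeeze-and-Excitation attention: global average pooling, two fully connected
    layers (compress then restore the channel count), and Sigmoid-normalised channel
    weights multiplied back onto the input feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # first FC: channel compression
            nn.ReLU(inplace=True),                       # activation function layer
            nn.Linear(channels // reduction, channels),  # second FC: back to channel count
            nn.Sigmoid())                                # normalised weight vector

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(x.mean(dim=(2, 3)))            # global average pooling -> (B, C)
        return x * weights.view(b, c, 1, 1)              # re-weight each channel
```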
The decoder is composed of depthwise separable residual convolutions and uses bilinear interpolation for upsampling. This not only fuses shallow and deep features and improves feature extraction capability, but also raises the running speed of the semantic segmentation model, keeping the model as small as possible and reducing computational complexity while preserving identification accuracy.
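A decoder stage following this description might look like the sketch below: bilinear upsampling, concatenation with the SE-weighted skip feature, and depthwise separable convolutions. The exact channel arrangement and the number of convolutions per stage are assumptions.

```python
import torch.nn.functional as F

class DecoderBlock(nn.Module):
    """Decoder stage: bilinear upsampling, concatenation with the skip feature,
    then two depthwise separable convolutions to fuse shallow and deep features."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        def ds_conv(ci, co):                                   # depthwise + pointwise conv
            return nn.Sequential(
                nn.Conv2d(ci, ci, 3, padding=1, groups=ci, bias=False),
                nn.Conv2d(ci, co, 1, bias=False),
                nn.BatchNorm2d(co), nn.ReLU(inplace=True))
        self.fuse = nn.Sequential(ds_conv(in_ch + skip_ch, out_ch), ds_conv(out_ch, out_ch))

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[2:], mode='bilinear', align_corners=False)
        return self.fuse(torch.cat([x, skip], dim=1))
```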
Fig. 7 shows the power line feature extraction results of the invention against different backgrounds; the invention achieves automatic identification and extraction of high-voltage power line features in aerial images.
In summary, the embodiment of the present invention uses Res2Net residual modules as the feature extraction network of an improved U-Net network model; by grouping the feature channels and connecting them hierarchically in the manner of a filter bank, multiple receptive fields of finer granularity are obtained, which effectively improves the extraction of linear features. SE modules are connected across corresponding layers of the encoder and decoder, and linear features are enhanced by assigning corresponding attention weights to different feature maps. A feature fusion module is connected between the bottom layers of the encoder and decoder, and image features are mapped at multiple scales through dilated convolution to capture global information and improve the identification accuracy of the power transmission line. The method uses the semantic segmentation network to automatically identify and extract high-voltage power transmission line features in aerial images, and a trained model can quickly extract the high-voltage power transmission line features in an image.
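Pulling the sketches above together, one possible assembly of the improved U-Net is outlined below. It reuses the `Res2NetBlock`, `FeatureFusion`, `SEBlock` and `DecoderBlock` sketches from the preceding paragraphs; the channel widths per layer and the single-channel output head are assumptions not stated in the patent.

```python
class ImprovedUNet(nn.Module):
    """Layout sketch: five Res2Net encoder layers joined by max pooling, a
    dilated-convolution fusion module at the bottom, SE attention on each skip
    connection, and a decoder mirroring the encoder."""
    def __init__(self, in_ch=3, num_classes=1, widths=(64, 128, 256, 512, 1024)):
        super().__init__()
        self.enc = nn.ModuleList()
        prev = in_ch
        for w in widths:                                  # five encoder feature layers
            self.enc.append(Res2NetBlock(prev, w))
            prev = w
        self.pool = nn.MaxPool2d(2)
        self.fusion = FeatureFusion(widths[-1])           # bottom-layer feature fusion
        self.se = nn.ModuleList([SEBlock(w) for w in widths[:-1]])
        self.dec = nn.ModuleList([DecoderBlock(widths[i + 1], widths[i], widths[i])
                                  for i in reversed(range(4))])
        self.head = nn.Conv2d(widths[0], num_classes, 1)  # per-pixel power-line logits

    def forward(self, x):
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x if i == 0 else self.pool(x))
            if i < 4:
                skips.append(x)                           # features for the skip connections
        x = self.fusion(x)
        for stage, i in zip(self.dec, reversed(range(4))):
            x = stage(x, self.se[i](skips[i]))            # SE-weighted skip connection
        return self.head(x)
```

With `num_classes=1`, the output is a per-pixel logit map that can be thresholded to obtain a power-line mask of the kind shown in Fig. 7.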
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (7)

1. A high-voltage transmission line identification method is characterized by comprising the following steps:
s1, adopting a deep convolutional neural network to construct a semantic segmentation network model for high-voltage transmission line identification;
s2, training and optimizing the semantic segmentation network model established in the step S1 on the aerial photography power line data set;
s3, loading the semantic segmentation network model trained in the step S2 to a cloud platform server, collecting an aerial power line image of the unmanned aerial vehicle, processing the image and transmitting the image to the cloud platform server, and further optimizing and training the semantic segmentation network model by the cloud platform server in a reinforcement learning mode;
and S4, deploying the semantic segmentation network model trained in the step S3 to an edge computing platform to perform online identification of the high-voltage transmission line.
2. The high-voltage power transmission line identification method according to claim 1, wherein the semantic segmentation network model in step S3 is an improved U-net network model composed of an encoder, a feature fusion module, a decoder and an attention mechanism module, the bottommost layer of said encoder being connected with the bottommost layer of said decoder through said feature fusion module, and the feature layers of said encoder other than the bottommost layer being each connected with the corresponding feature layer of said decoder through said attention mechanism module.
3. The high-voltage transmission line identification method according to claim 2, wherein said encoder comprises a first feature layer, a second feature layer, a third feature layer, a fourth feature layer and a fifth feature layer; each of the five feature layers is a residual module, and the five feature layers are connected in sequence through max pooling layers.
4. The high-voltage power transmission line identification method according to claim 3, wherein said residual module comprises a first 1 × 1 convolution, a second 1 × 1 convolution, a first 3 × 3 convolution, a second 3 × 3 convolution and a third 3 × 3 convolution; the input feature map X00 is channel-adjusted by the first 1 × 1 convolution and then divided evenly into 4 groups, namely feature map X01, feature map X02, feature map X03 and feature map X04; feature map X01 is output directly as feature map Y01; feature map X02 is output as feature map Y02 after the first 3 × 3 convolution; feature map X03 is fused with feature map Y02 and output as feature map Y03 after the second 3 × 3 convolution; feature map X04 is fused with feature map Y03 and output as feature map Y04 after the third 3 × 3 convolution; feature maps Y01, Y02, Y03 and Y04 are first concatenated (Concat), passed through the second 1 × 1 convolution to adjust channels, and then added (Add) to the input feature map X00 to give the output feature map Y00.
5. The high-voltage power transmission line identification method according to claim 3, wherein said residual module comprises a first 1 × 1 convolution, a second 1 × 1 convolution, a first 3 × 3 convolution, a second 3 × 3 convolution and a third 3 × 3 convolution; the input feature map X10 is channel-adjusted by the first 1 × 1 convolution and then divided evenly into 4 groups, namely feature map X11, feature map X12, feature map X13 and feature map X14; feature map X11 is output directly as feature map Y11; feature map X12 is fused with feature map Y11 and output as feature map Y12 after the first 3 × 3 convolution; feature map X13 is fused with feature maps Y11 and Y12 and output as feature map Y13 after the second 3 × 3 convolution; feature map X14 is fused with feature maps Y11, Y12 and Y13 and output as feature map Y14 after the third 3 × 3 convolution; feature maps Y11, Y12, Y13 and Y14 are first concatenated (Concat), passed through the second 1 × 1 convolution to adjust channels, and then added to the input feature map X10 to give the output feature map Y10.
6. The high-voltage power transmission line identification method according to claim 2, wherein the feature fusion module is a pyramid module formed by four dilated (atrous) convolutions; the four dilated convolutions are 3 × 3 convolutions with dilation rates of 1, 6, 12 and 18 respectively and are connected in parallel; the input feature map is concatenated (Concat) with the parallel outputs of the four dilated convolutions, and the result is passed through a 1 × 1 convolution to adjust channels and give the output feature map.
7. The high-voltage transmission line identification method according to claim 2, wherein said attention mechanism module is a Squeeze-and-Excitation (SE) module comprising a global average pooling layer, a first fully connected layer, an activation function layer, a second fully connected layer and a Sigmoid function layer; the input feature map is pooled by the global average pooling layer; the first fully connected layer produces a vector corresponding to the number of channels; after channel compression and the activation function layer, the second fully connected layer restores a vector equal in length to the number of channels; the Sigmoid function layer normalizes this vector into weight vectors, and the input feature map is multiplied by the corresponding weights to give the output feature map.
Application CN202211006355.2A, priority date 2022-08-22, filing date 2022-08-22: High-voltage transmission line identification method; status Withdrawn; published as CN115424016A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211006355.2A CN115424016A (en) 2022-08-22 2022-08-22 High-voltage transmission line identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211006355.2A CN115424016A (en) 2022-08-22 2022-08-22 High-voltage transmission line identification method

Publications (1)

Publication Number Publication Date
CN115424016A true CN115424016A (en) 2022-12-02

Family

ID=84199168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211006355.2A Withdrawn CN115424016A (en) 2022-08-22 2022-08-22 High-voltage transmission line identification method

Country Status (1)

Country Link
CN (1) CN115424016A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051549A (en) * 2023-03-29 2023-05-02 山东建筑大学 Method, system, medium and equipment for dividing defects of solar cell
CN116051549B (en) * 2023-03-29 2023-12-12 山东建筑大学 Method, system, medium and equipment for dividing defects of solar cell
CN116403212A (en) * 2023-05-16 2023-07-07 西安石油大学 Method for identifying small particles in pixels of metallographic image based on improved U-net network
CN116403212B (en) * 2023-05-16 2024-02-02 西安石油大学 Method for identifying small particles in pixels of metallographic image based on improved U-net network

Similar Documents

Publication Publication Date Title
Wang et al. SFNet-N: An improved SFNet algorithm for semantic segmentation of low-light autonomous driving road scenes
EP4414890A1 (en) Model training and scene recognition method and apparatus, device, and medium
CN112200764B (en) Photovoltaic power station hot spot detection and positioning method based on thermal infrared image
CN115424016A (en) High-voltage transmission line identification method
CN113392960B (en) Target detection network and method based on mixed hole convolution pyramid
CN112163628A (en) Method for improving target real-time identification network structure suitable for embedded equipment
CN111832655A (en) Multi-scale three-dimensional target detection method based on characteristic pyramid network
CN114842363B (en) Identification method and system for key power equipment in digital twin platform area
CN117557775B (en) Substation power equipment detection method and system based on infrared and visible light fusion
CN114998667B (en) Multispectral target detection method, multispectral target detection system, multispectral target detection computer equipment and multispectral target storage medium
CN117079163A (en) Aerial image small target detection method based on improved YOLOX-S
CN117934375A (en) Lightweight lithium battery surface defect detection method for enhancing image feature fusion
CN115841438A (en) Infrared image and visible light image fusion method based on improved GAN network
CN115223009A (en) Small target detection method and device based on improved YOLOv5
CN115527098A (en) Infrared small target detection method based on global mean contrast space attention
Yang et al. Safety helmet wearing detection based on an improved YOLOv3 scheme
CN118298149A (en) Target detection method for parts on power transmission line
CN116030365A (en) Model training method, apparatus, computer device, storage medium, and program product
CN116580324A (en) Yolov 5-based unmanned aerial vehicle ground target detection method
Lu et al. Anomaly detection method for substation equipment based on feature matching and multi-Semantic classification
CN117422689B (en) Rainy day insulator defect detection method based on improved MS-PReNet and GAM-YOLOv7
Yi et al. A Multi-Stage Duplex Fusion Convnet for Aerial Scene Classification
Lu et al. MStrans: Multiscale Vision Transformer for Aerial Objects Detection
Jia et al. Super-Resolution Reconstruction of Single Image Combining Bionic Eagle-Eye and Multi-scale
CN115841585B (en) Method for carrying out knowledge distillation on point cloud segmentation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20221202)