CN114241413A - Substation multi-target detection method based on attention mechanism and feature balance - Google Patents

Substation multi-target detection method based on attention mechanism and feature balance

Info

Publication number
CN114241413A
Authority
CN
China
Prior art keywords
target detection
feature
substation
network
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111544623.1A
Other languages
Chinese (zh)
Inventor
朱新山
李亚霖
郭志民
李斌
王帅
曾筠婷
屈璐瑶
田杨阳
刘昊
赵健
毛万登
贺翔
张小斐
袁少光
耿俊成
魏小昭
陈岑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Original Assignee
Tianjin University
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University, State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Henan Electric Power Co Ltd
Priority to CN202111544623.1A
Publication of CN114241413A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211: Selection of the most significant subset of features
    • G06F18/2113: Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

A substation multi-target detection method based on an attention mechanism and feature balance comprises the following steps: acquiring image data samples of substation equipment, recording the target information in each image data sample with data annotation software, and constructing a substation multi-target detection data set from the target information; establishing a substation multi-target detection network based on the darknet-53 network and an attention mechanism, performing feature balance and then feature fusion on the feature maps to obtain fused feature maps, and performing multi-branch detection on the fused feature maps to obtain the substation multi-target detection result; iteratively training the substation multi-target detection network on the data set using transfer learning; and detecting substation equipment with the trained multi-target detection network. The method addresses the difficulty of identifying targets against complex backgrounds, targets of similar appearance, and targets that occlude one another, thereby improving substation operation and maintenance efficiency.

Description

Substation multi-target detection method based on attention mechanism and feature balance
Technical Field
The invention belongs to the technical field of power inspection, and in particular relates to a substation multi-target detection method based on an attention mechanism and feature balance.
Background
Efficient operation and maintenance of the smart grid is important for improving grid reliability and service levels. In recent years, the tension between the expanding scale of the power grid and the relative shortage of operation and maintenance personnel has become increasingly prominent, risk factors affecting grid safety have persisted, and the traditional operation and maintenance model struggles to keep pace with the rapid development of the grid. At present, substation inspection relies mainly on manual patrols. Inspections are often not carried out properly (for example, missed, delayed, or skipped inspections), the labor intensity is high, the detection quality depends heavily on the competence of the inspectors, and the inspection data cannot be entered into the management information system accurately and promptly. Under high-voltage conditions and in some severe weather, manual inspection also poses considerable safety hazards. Intelligent technology is therefore urgently needed for grid inspection, and target detection technology is of great significance for intelligent grid inspection. Target detection for substation equipment nevertheless faces many difficulties, owing to insufficient samples and to the complex image backgrounds that result from complex substation scenes.
Images from substations have the following characteristics: (1) substation equipment is of many types and is specific to the industry; (2) images from outdoor open-air substations are very complex and often contain multiple identical or similar targets, such as insulators, main transformer outgoing line bushings, and isolating switch support insulators; (3) images taken from video have low resolution. These characteristics make substation equipment images harder to process, and directly applying existing target detection techniques cannot meet practical engineering requirements.
In the prior art, target detection algorithms can generally be divided into methods based on manual feature extraction and methods based on automatic feature extraction through deep learning.
Traditional target detection algorithms rely mainly on hand-crafted features. Paul Viola et al. use a sliding-window detection method with an integral image to speed up feature extraction, but the speed and accuracy of this method are difficult to adapt to complex substation scenes (Viola P., Jones M. Rapid Object Detection Using a Boosted Cascade of Simple Features [C]. IEEE Conference on Computer Vision and Pattern Recognition, 2001, 1: 511). The HOG feature underlies all gradient-feature-based target detectors and performs detection with the basic scheme of a multi-scale pyramid plus a sliding window, but it struggles to meet the accuracy and speed requirements of real-time substation equipment monitoring (Dalal N., Triggs B. Histograms of Oriented Gradients for Human Detection [C]. IEEE Computer Society Conference on Computer Vision & Pattern Recognition. IEEE Computer Society, 2005: 886-). Felzenszwalb et al. proposed the DPM (Deformable Part-based Model), which decomposes the overall detection of a target, as done in conventional algorithms, into the detection of the individual parts of the model and then aggregates the part-level results into the final detection result. The DPM model adopts a weakly supervised learning strategy, which improves accuracy, but compared with deep learning detection models its accuracy still cannot meet the requirements of substation target detection (Felzenszwalb P., McAllester D., Ramanan D. A Discriminatively Trained, Multiscale, Deformable Part Model [C]. IEEE Computer Society Conference on Computer Vision & Pattern Recognition. 2008, 8: 1-8.).
The main problem with traditional methods is that the features used for target detection are mainly designed by hand. This makes feature selection difficult and leaves the way features should be combined unclear, so the resulting detection performance can hardly meet the requirements of intelligent substation operation and maintenance. Moreover, because different targets have different characteristics, traditional methods struggle to extract generally applicable features for multi-target detection; their designs are complex, generalize poorly, and are severely limited in performance.
In recent years, deep neural networks have been widely applied in many fields, such as target detection, image segmentation, and natural language processing, with excellent results. Deep convolutional neural networks have brought a qualitative leap in the performance and speed of target detection. A convolutional neural network can automatically learn effective and robust features from data and optimize the classifier jointly, improving multi-target detection accuracy while maintaining detection speed. However, combining deep learning with substation target detection still faces many challenges, including complex operation and maintenance scenes, strong environmental interference, and insufficient training samples. The current level of substation inspection can hardly meet the equipment state perception and cognition requirements of highly intelligent operation and maintenance management of the future power grid, and the in-depth application of new-generation artificial intelligence technology to intelligent grid operation and maintenance is urgently needed.
Mucheng et al. use a YOLOv3 target detection network with multi-scale training to identify anti-bird spike components on transmission lines, achieving relatively high detection accuracy with a limited number of samples by means of transfer learning. However, because substation backgrounds are complex and contain many similar targets, this scheme is difficult to adapt to substation target detection (muschiren, forest, river-space, old quiet, liu xin yu, zhuang yu n. Transmission line anti-bird-spike component identification and fault detection based on the deep convolutional neural network [J]. Power System Technology, 2021, 45(01): 126-). Haoshi et al. introduce an attention module into the backbone feature extraction network of YOLOv3 and improve transmission line fault detection accuracy. However, substation scenes contain more similar targets and heavier occlusion, so the improved module is difficult to apply in the new scene. Furthermore, introducing too many attention modules into the network tends to disrupt its feature extraction part, and the effect on multi-target detection in substations is poor (haushuai, mazu, Zhao Xinsheng, Ankuyi, Zusau, Maxu. YOLOv3 transmission line fault detection method based on the convolutional block attention model [J]. Power System Technology, 2021, 45(08): 2979-). Qin et al. argue that the weight representation obtained by global average pooling lacks diversity, and demonstrate mathematically that global average pooling corresponds to the lowest-frequency component of the discrete cosine transform of the feature map. On this basis, they propose the FcaNet module, which can select additional frequency components, as a new attention generation scheme. However, since a neural network mainly extracts low-frequency components during feature extraction, introducing several other frequency components brings little improvement to target detection in engineering practice (Qin Z., Zhang P., Wu F., et al.).
He et al. propose a residual learning framework to accommodate neural networks of increasing depth. Applying residual blocks can further improve the performance of conventional computer vision networks, and the authors argue on theoretical grounds that introducing residual blocks improves network performance, so introducing residual blocks into a new target detection framework can improve the recognition accuracy of the detection network (He K., Zhang X., Ren S., et al. Deep residual learning for image recognition [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.).
Adding feature fusion to a target detection framework is currently an effective approach. Tan et al. observe that conventional feature fusion methods fuse feature maps of different resolutions directly, without considering that feature maps of different resolutions contribute unequally to the prediction result. They therefore propose BiFPN, a simple and efficient weighted bidirectional feature pyramid network, which introduces learnable weights to learn the importance of different feature maps and repeatedly applies top-down and bottom-up multi-scale feature fusion, improving the network's feature extraction and target prediction capabilities (Tan M., Pang R., Le Q. V. EfficientDet: Scalable and Efficient Object Detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 10781-). Szegedy et al. propose the Inception network architecture, in which the network adaptively selects the convolution kernel size. This architecture lets the neural network automatically select suitable features, improving its feature extraction ability (Szegedy C., Liu W., Jia Y., et al. Going deeper with convolutions [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1-9.). Tangyu et al. replace the original feature fusion part of YOLOv3 with a BiFPN feature fusion module, strengthening feature interaction among multi-scale feature maps and increasing the recognition accuracy of the YOLOv3 network. However, this method significantly increases the number of network parameters and consumes excessive computing resources (spin, win. Infrared thermal image recognition method for power transformation equipment based on image enhancement and deep learning [J/OL]. Proceedings of the CSEE: 1-10 [2021-09-13]. http://kns.cnki.net/kcms/detail/11.2107.tm.20210601.1000.002.html.).
Multi-target recognition based on convolutional neural networks can automatically extract effective features from data samples. Training on large-scale data sets strengthens the network's ability to adapt to specific tasks while preserving its generalization, and achieves results clearly superior to those of traditional methods. However, existing multi-target detection algorithms are generally applied to relatively simple power grid scenarios; most of them address target recognition and fault detection on transmission lines, and research on scenes inside substations is comparatively scarce. Some researchers identify target equipment in substations by improving the existing YOLOv3 algorithm, but the range of detected target types is limited, and the problems of similar targets and complex backgrounds remain difficult to solve.
Disclosure of Invention
To overcome the deficiencies of the prior art, the invention aims to provide a substation multi-target detection method based on an attention mechanism and feature balance that addresses the difficulty of identifying targets against complex backgrounds, targets of similar appearance, and mutually occluding targets, thereby improving substation operation and maintenance efficiency.
The invention adopts the following technical scheme.
The substation multi-target detection method based on the attention mechanism and the feature balance comprises the following steps:
step 1, obtaining image data samples of substation equipment, recording the target information of each image data sample with data annotation software, and constructing a substation multi-target detection data set from the target information;
step 2, establishing a substation multi-target detection network based on the darknet-53 network and an attention mechanism, using the network to perform feature balance and then feature fusion on the feature maps to obtain fused feature maps, and performing multi-branch detection on the fused feature maps to obtain the substation multi-target detection result;
step 3, iteratively training the substation multi-target detection network on the substation multi-target detection data set using transfer learning;
step 4, detecting the substation equipment with the trained multi-target detection network.
Preferably, in step 1, image data samples of the substation equipment are collected under different illumination, time, and weather conditions; the collected data samples include: main transformer oil conservator, main transformer outgoing line bushing, main transformer heat dissipation device, insulator string, instrument transformer, grading ring, isolating switch, nameplate, and instrument panel;
the target information includes a category of the target area and a location of the target area.
Preferably, step 2 comprises:
step 2.1, forming an improved darknet-53 backbone network based on the darknet-53 network and the attention mechanism, and generating feature maps with different resolutions by the improved darknet-53 backbone network;
step 2.2, carrying out feature balance and feature fusion on each feature map in sequence to obtain a fused feature map;
step 2.3, performing detection on the fused feature maps, and taking the obtained detection features as the substation multi-target detection result.
Preferably, step 2.1 comprises:
step 2.1.1, forming a residual unit by connecting several convolution blocks in series; each convolution block includes a convolution operation, a batch normalization operation, and an activation function operation;
step 2.1.2, connecting different numbers of residual units in series to form different residual blocks;
step 2.1.3, forming a darknet-53 module by using the first residual block, the second residual block, the third residual block, the fourth residual block and the fifth residual block;
step 2.1.4, on the basis of the darknet-53 module, adding a hybrid attention module to each residual block that generates a feature map used for prediction, forming an improved darknet-53 backbone network; feature maps of different resolutions are generated by the improved darknet-53 backbone network.
In step 2.1.4, the hybrid attention module is added before the next convolution in each of the third, fourth and fifth residual blocks, which generate the feature maps used for prediction.
The hybrid attention module comprises a spatial attention submodule and a channel attention submodule connected in series;
the channel attention submodule applies both max pooling and average pooling when computing the channel weights: the weights obtained after max pooling and after average pooling are added, the sum is normalized, and the normalized values are used as the weights of the channels of the original feature map.
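The hybrid attention module described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the patented implementation: the shared two-layer MLP (`w1`, `w2`), the sigmoid used as the normalization, the simplified spatial mask (a weighted sum of the per-pixel average and maximum instead of a learned convolution), and the spatial-then-channel ordering are all assumptions, and all function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, w2):
    """Weight the channels of a feature map x with shape (C, H, W).

    w1 (C // r, C) and w2 (C, C // r) form a small shared MLP; the MLP and
    the sigmoid normalization are assumptions, not taken from the source.
    """
    avg_pooled = x.mean(axis=(1, 2))   # global average pooling -> (C,)
    max_pooled = x.max(axis=(1, 2))    # global max pooling -> (C,)

    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)

    # Add the two pooled branches, then normalize the sum into [0, 1].
    weights = sigmoid(shared_mlp(avg_pooled) + shared_mlp(max_pooled))
    return x * weights[:, None, None], weights


def spatial_attention(x, a=1.0, b=1.0):
    """Simplified spatial mask: a and b stand in for a learned convolution."""
    avg_map = x.mean(axis=0)           # per-pixel mean over channels -> (H, W)
    max_map = x.max(axis=0)            # per-pixel max over channels -> (H, W)
    mask = sigmoid(a * avg_map + b * max_map)
    return x * mask[None, :, :], mask


def hybrid_attention(x, w1, w2):
    """Spatial and channel submodules in series (this order is an assumption)."""
    x, _ = spatial_attention(x)
    x, _ = channel_attention(x, w1, w2)
    return x
```

The output has the same shape as the input, so a module of this kind can be placed inside a residual block without changing the layers around it.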
Preferably, step 2.2 comprises:
step 2.2.1, carrying out feature balance on each feature map, namely carrying out convolution operation on each feature map to ensure that the feature depths among the feature maps are consistent;
step 2.2.2, performing feature fusion on the feature map after feature balance, namely performing deconvolution operation on the feature map after feature balance to realize upsampling of the feature map; and carrying out feature fusion on the feature map obtained by the up-sampling and the feature map generated by the improved darknet-53 backbone network to obtain a fusion feature map.
Preferably, step 2.3 comprises:
step 2.3.1, in the feature map of each resolution, placing several anchor boxes of different sizes at each pixel position;
step 2.3.2, performing detection on the anchor boxes using classification and regression; the detection comprises:
1) classification branch detection, which obtains the category of the anchor boxes at each pixel;
2) bounding box regression detection, which obtains the coordinates of the center point of the target area and the offsets of the width and height of the target bounding box relative to the anchor box;
step 2.3.3, performing a convolution operation on the classification branch result and the bounding box regression result to extract detection features, and taking the detection features as the substation multi-target detection result.
The size of each anchor box is obtained by clustering the target sizes in the substation multi-target detection data set.
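Clustering target sizes into anchor boxes can be sketched as follows. The text only says that the anchor sizes come from clustering; the k-means variant below, which assigns each box to the anchor with the highest IoU (as is common in YOLO-style detectors), is an assumption, and the function names are hypothetical.

```python
import numpy as np


def iou_wh(boxes, anchors):
    """IoU between boxes (N, 2) and anchors (K, 2) given as (width, height),
    with every box and anchor aligned at a common corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_boxes = boxes[:, 0] * boxes[:, 1]
    area_anchors = anchors[:, 0] * anchors[:, 1]
    return inter / (area_boxes[:, None] + area_anchors[None, :] - inter)


def cluster_anchors(boxes, k, iters=100, seed=0):
    """k-means over (width, height) pairs using IoU as the similarity."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most.
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)
        new_anchors = anchors.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):           # keep the old anchor if a cluster is empty
                new_anchors[j] = members.mean(axis=0)
        if np.allclose(new_anchors, anchors):
            break
        anchors = new_anchors
    # Return the anchors sorted by area, smallest first.
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```

Sorting by area makes it straightforward to hand the smallest anchors to the highest-resolution feature map and the largest anchors to the lowest-resolution one.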
Preferably, step 3 comprises:
step 3.1, augmenting and expanding the substation multi-target detection data set using color gamut conversion, flipping, and image mirroring;
step 3.2, loading the pre-trained weight parameters of the multi-target detection network;
step 3.3, iteratively training the substation multi-target detection network using transfer learning, with the Adam optimizer as the training optimizer.
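The augmentations named in step 3.1 must also transform the annotated boxes so that labels stay aligned with the image. The sketch below is a hedged illustration: treating mirroring as a horizontal flip and flipping as a vertical flip, representing boxes as `[x_min, y_min, x_max, y_max]` in pixels, and using a per-channel gain as a stand-in for color gamut conversion are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def mirror_horizontal(image, boxes):
    """Mirror an (H, W, 3) image left-right and update boxes (N, 4)."""
    width = image.shape[1]
    out = boxes.astype(float).copy()
    out[:, 0] = width - boxes[:, 2]    # new x_min comes from the old x_max
    out[:, 2] = width - boxes[:, 0]    # new x_max comes from the old x_min
    return image[:, ::-1, :].copy(), out


def flip_vertical(image, boxes):
    """Flip an (H, W, 3) image top-bottom and update boxes (N, 4)."""
    height = image.shape[0]
    out = boxes.astype(float).copy()
    out[:, 1] = height - boxes[:, 3]
    out[:, 3] = height - boxes[:, 1]
    return image[::-1, :, :].copy(), out


def color_jitter(image, gains):
    """Scale each color channel by a gain and clip to the uint8 range.
    A simple stand-in for color gamut conversion; boxes are unaffected."""
    scaled = image.astype(np.float32) * np.asarray(gains, dtype=np.float32)
    return np.clip(scaled, 0, 255).astype(np.uint8)
```

Applying `mirror_horizontal` twice returns the original image and boxes, which is a convenient sanity check when wiring augmentations into a data loader.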
In step 3.2, the overfitting risk of the substation multi-target detection network during iterative training is controlled with L2 regularization.
In step 3.2, the total number of training epochs of the multi-target detection network is 100: while the number of training epochs is less than or equal to 50, the parameters of the darknet-53 backbone network are frozen during training; once it exceeds 50, the entire multi-target detection network is trained.
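The two-phase schedule and the L2 penalty can be sketched in plain Python. The epoch counts come from the text; the 1-based epoch numbering, the penalty form `weight_decay * sum(w ** 2)`, and all names are assumptions made for illustration.

```python
TOTAL_EPOCHS = 100   # total training epochs, from the text
FREEZE_EPOCHS = 50   # backbone is frozen while epoch <= 50, from the text


def backbone_trainable(epoch):
    """Return True when the darknet-53 backbone should be updated.
    Epochs are numbered from 1 (an assumption)."""
    return epoch > FREEZE_EPOCHS


def l2_penalty(weights, weight_decay):
    """L2 regularization term added to the loss; weight_decay is a
    hypothetical coefficient, not given in the source."""
    return weight_decay * sum(w * w for w in weights)


def training_plan():
    """List which part of the network is trained in each epoch."""
    plan = []
    for epoch in range(1, TOTAL_EPOCHS + 1):
        phase = "backbone + head" if backbone_trainable(epoch) else "head only"
        plan.append((epoch, phase))
    return plan
```

Freezing the pre-trained backbone first lets the randomly initialized detection layers settle before the backbone weights are fine-tuned, which is the usual motivation for this kind of transfer learning schedule.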
In step 3.3, the substation multi-target detection data set contains category information and position information, and iterative training yields category predictions and position predictions; each category prediction is compared with the ground-truth category of the corresponding label, and each position prediction with the ground-truth position of the corresponding label, to obtain the category error and the position error; these errors are back-propagated through the substation multi-target detection network as gradients to update the network parameters.
Preferably, in step 4, the trained multi-target detection network is deployed on the test system, and the multi-target detection result of the substation equipment is obtained by taking the image data of the substation equipment as input.
Compared with the prior art, the invention has the following advantages: the importance of different channels of the feature map is screened automatically during feature extraction, strengthening the network's feature extraction capability; the imbalance among feature maps in feature fusion is alleviated, improving prediction accuracy; mutual occlusion among substation targets is handled better, increasing the probability of recognizing occluded targets; the method adapts to interference from the complex substation environment and realizes end-to-end substation multi-target detection with directly output results; and better results can be obtained on large-scale data sets.
Drawings
FIG. 1 is a block diagram of the steps of a substation multi-target detection method based on attention mechanism and feature balance according to the present invention;
FIG. 2 is an original image of substation-related equipment in an embodiment of the present invention;
FIG. 3 is a schematic diagram of substation data set labeling in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the structure of a darknet-53 backbone network in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a test result on the substation test set in an embodiment of the present invention.
Detailed Description
The present application is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.
As shown in fig. 1, the substation multi-target detection method based on attention mechanism and feature balance includes:
Step 1, obtaining image data samples of substation equipment, recording the target information of each image data sample with data annotation software, and constructing a substation multi-target detection data set from the target information.
It should be noted that collecting images of substation equipment as the raw data set is a non-limiting preferred choice in the preferred embodiment of the invention; those skilled in the art may collect target images in different scenes as the raw data set according to detection requirements. The network structure provided by the invention can also be used for target detection in video streams and is not limited to image detection.
Preferably, in step 1, image data samples of the substation equipment are collected under different illumination, time, and weather conditions; the collected data samples include: main transformer oil conservator, main transformer outgoing line bushing, main transformer heat dissipation device, insulator string, instrument transformer, grading ring, isolating switch, nameplate, and instrument panel.
It should be noted that using nine types of target samples as the image data samples of substation equipment is a preferred choice in the preferred embodiment of the invention; those skilled in the art may collect different numbers of target samples at different positions as image data samples according to detection requirements.
In the preferred embodiment of the present invention, 3103 images of a certain substation are selected; some of the original images are shown in FIG. 2. At least 100 images are collected for each sample type, at least 200 images contain devices that are few in number, such as the main transformer outgoing line bushing and the oil conservator, and each image contains as many target types as possible.
The target information includes the category of the target area and the location of the target area. In the preferred embodiment of the invention, the image annotation tool LabelImg is used for data annotation, and the annotation result is shown in FIG. 3.
Step 2, establishing a substation multi-target detection network based on the darknet-53 network and an attention mechanism, using the network to perform feature balance and then feature fusion on the feature maps to obtain fused feature maps, and performing multi-branch detection on the fused feature maps to obtain the substation multi-target detection result.
Preferably, as shown in fig. 4, step 2 comprises:
step 2.1, forming an improved darknet-53 backbone network based on the darknet-53 network and the attention mechanism, and generating feature maps with different resolutions by the improved darknet-53 backbone network;
step 2.2, carrying out feature balance and feature fusion on each feature map in sequence to obtain a fused feature map;
step 2.3, performing detection on the fused feature maps, and taking the obtained detection features as the substation multi-target detection result.
Preferably, step 2.1 comprises:
step 2.1.1, forming a residual error unit by connecting a plurality of groups of convolution blocks in series; the convolution block includes: convolution operation, batch normalization operation and activation function operation;
in the preferred embodiment of the present invention, each residual unit is composed of two 3 × 3 convolutions and one 1 × 1 convolution, where convolution refers to the integral operation formed by convolution + batch normalized BN + activation function leak _ Relu.
And 2.1.2, connecting different numbers of residual error units in series to form different residual error blocks.
Step 2.1.3, forming a darknet-53 module by using the first residual block, the second residual block, the third residual block, the fourth residual block and the fifth residual block; in the preferred embodiment of the present invention, there are 5 residual blocks from the shallow layer to the deep layer, wherein the number of residual units in the residual blocks is 1, 2, 8,8, 4 in sequence, and the resolution of the feature map output by each residual block is gradually reduced.
Step 2.1.4, on the basis of the darknet-53 module, adding a mixed attention module into the residual blocks that generate the feature maps used for prediction, to form an improved darknet-53 backbone network; feature maps of different resolutions are generated by the improved darknet-53 backbone network.
In step 2.1.4, the residual blocks that generate the feature maps used for prediction are the third, fourth and fifth residual blocks, and in each of them the mixed attention module is added before the next convolution.
The mixed attention module comprises a spatial attention submodule and a channel attention submodule connected in series;
the channel attention submodule uses both maximum pooling and average pooling to preprocess the channel weights: the weight results after maximum pooling and average pooling are added, the sum is normalized, and the normalized value is used as the weight of each channel of the original feature map.
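The channel weighting described above can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the use of a sigmoid as the normalization, the global (whole-channel) pooling, and the function name `channel_attention` are assumptions made for illustration, and any learned layers that an actual attention module may apply to the pooled values are omitted.

```python
import math

def channel_attention(feature_map):
    """feature_map: list of channels, each a 2-D list (H x W) of floats.

    Returns the per-channel weights and the reweighted feature map.
    """
    weights = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        max_pooled = max(values)                 # global maximum pooling
        avg_pooled = sum(values) / len(values)   # global average pooling
        summed = max_pooled + avg_pooled         # add the two pooled results
        # Assumed normalization: sigmoid maps the sum into (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-summed)))
    reweighted = [
        [[v * w for v in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
    return weights, reweighted
```

Under these assumptions, a channel with larger peak and mean activations receives a weight closer to 1, so the differences in importance between channels are reflected in the reweighted feature map.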
In the preferred embodiment of the invention, as convolution proceeds, the feature map size decreases and the number of channels increases; finally, the backbone network outputs feature maps of three sizes, 52 × 52 × 256, 26 × 26 × 512 and 13 × 13 × 1024, which enter the subsequent prediction stage to realize target detection. When a feature map passes through the corresponding mixed attention module in the backbone network, the difference in importance between its channels is accentuated and the redundancy of feature information is reduced.
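The relationship between the five residual blocks and the three output sizes can be checked with a short shape calculation. The sketch below rests on assumptions not stated in the text: a 416 × 416 input image, a halving of the resolution at the start of each residual block, and the per-block channel counts 64 and 128 for the first two blocks (taken from the standard darknet-53 layout); the function name is illustrative.

```python
def backbone_feature_shapes(input_size=416):
    """Return (height, width, channels, residual_units) for each residual block."""
    units_per_block = [1, 2, 8, 8, 4]          # residual units, as given in the text
    channels = [64, 128, 256, 512, 1024]       # first two values are assumed
    shapes = []
    size = input_size
    for units, ch in zip(units_per_block, channels):
        size //= 2                             # assumed stride-2 downsampling per block
        shapes.append((size, size, ch, units))
    return shapes

shapes = backbone_feature_shapes(416)
# The third, fourth and fifth blocks supply the feature maps used for prediction.
prediction_maps = [s[:3] for s in shapes[2:]]
print(prediction_maps)  # [(52, 52, 256), (26, 26, 512), (13, 13, 1024)]
```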
Preferably, step 2.2 comprises:
step 2.2.1, performing feature balancing on each feature map, namely performing a convolution operation on each feature map so that the feature depths of the feature maps are consistent;
as the feature maps pass through successive convolutions, their resolution decreases and their number of channels increases. Before feature fusion, a corresponding number of convolution layers is added to the feature maps of different resolutions obtained from the backbone network, so as to balance the network structure. In the feature fusion part, the 13 × 13 × 1024 feature map has low resolution, a large receptive field and more high-level semantic information; its resolution is increased by deconvolution, generating feature maps of 26 × 26 × 256 and 52 × 52 × 128 in succession.
Step 2.2.2, performing feature fusion on the balanced feature maps, namely performing a deconvolution operation on the balanced feature maps to upsample them, and fusing the upsampled feature maps with the feature maps generated by the improved darknet-53 backbone network to obtain the fused feature maps.
The 26 × 26 × 256 and 52 × 52 × 128 feature maps are fused with the 26 × 26 × 512 and 52 × 52 × 256 feature maps output by the backbone network respectively to realize auxiliary prediction.
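The sizes involved in this fusion can be traced with a small sketch. The text does not state how the maps are combined; the code assumes channel-wise concatenation (as in the original YOLOv3), and it simplifies the second upsampling by starting directly from the 26 × 26 × 256 map. The function name `fuse_shapes` is illustrative.

```python
def fuse_shapes(deep, shallow, reduced_channels):
    """Upsample `deep` by 2x with `reduced_channels`, then concatenate with `shallow`.

    Shapes are (height, width, channels). Concatenation is an assumption.
    """
    h, w, _ = deep
    upsampled = (h * 2, w * 2, reduced_channels)   # deconvolution doubles the resolution
    if upsampled[:2] != shallow[:2]:
        raise ValueError("spatial sizes must match before fusion")
    fused = (shallow[0], shallow[1], upsampled[2] + shallow[2])
    return upsampled, fused

up26, fused26 = fuse_shapes((13, 13, 1024), (26, 26, 512), 256)
up52, fused52 = fuse_shapes(up26, (52, 52, 256), 128)
print(up26, fused26)  # (26, 26, 256) (26, 26, 768)
print(up52, fused52)  # (52, 52, 128) (52, 52, 384)
```

If the fusion were element-wise addition instead, the channel counts of the two inputs would first have to be made equal, which is one possible reading of the feature-balancing step.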
Preferably, step 2.3 comprises:
step 2.3.1, in the feature map of each resolution, placing several anchor boxes of different sizes at each pixel position;
in the preferred embodiment of the present invention, 9 anchor boxes of different sizes are placed at each pixel position of the feature map of each resolution.
Step 2.3.2, detecting the anchor boxes based on a classification and regression algorithm; the detection comprises:
1) classification branch detection, to obtain the category of the anchor boxes at each pixel;
2) bounding-box regression detection, to obtain the coordinates of the center point of the target area and the offsets of the width and height of the target bounding box relative to the anchor box;
step 2.3.3, performing a convolution operation on the classification branch detection result and the bounding-box regression detection result to extract detection features, and taking the detection features as the substation multi-target detection result.
The size of each anchor box is obtained by clustering the target sizes in the substation multi-target detection data set.
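One common way to obtain anchor sizes by clustering is k-means over the labeled box widths and heights with IoU as the similarity measure. The text only says "clustering", so the choice of k-means, the IoU measure, the deterministic initialization and the function names below are all assumptions; the sketch is meant to show the idea, not the method actually used.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (width, height) pairs into k anchor sizes, sorted by area."""
    # Deterministic initialization: k boxes evenly spaced in area order.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor with which it has the highest IoU.
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for anchor, members in zip(anchors, clusters):
            if members:
                new_anchors.append((sum(b[0] for b in members) / len(members),
                                    sum(b[1] for b in members) / len(members)))
            else:
                new_anchors.append(anchor)     # keep the anchor of an empty cluster
        if new_anchors == anchors:             # converged
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

For the preferred embodiment, such a routine would be called with `k=9` on the widths and heights of all labeled targets in the data set.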
Step 3, iteratively training the substation multi-target detection network on the substation multi-target detection data set based on a transfer learning method.
Preferably, step 3 comprises:
step 3.1, augmenting and expanding the substation multi-target detection data set using color gamut conversion, flipping and image mirroring;
step 3.2, loading the pre-training weight parameters of the multi-target detection network;
step 3.3, iteratively training the substation multi-target detection network based on the transfer learning method, with the Adam optimizer as the training optimizer.
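When an image is mirrored for augmentation, the position labels must be transformed with it. The following sketch shows horizontal mirroring only; the box format `(xmin, ymin, xmax, ymax)` in pixel-edge coordinates and the function name are assumptions, since the text does not specify the annotation format.

```python
def mirror_horizontal(image, boxes):
    """Mirror an image left-to-right and transform its bounding boxes.

    image: 2-D list (H x W) of pixel values.
    boxes: list of (xmin, ymin, xmax, ymax), with x measured from the left edge.
    """
    width = len(image[0])
    mirrored_image = [row[::-1] for row in image]
    # A box edge at distance x from the left ends up at distance x from the right.
    mirrored_boxes = [(width - xmax, ymin, width - xmin, ymax)
                      for (xmin, ymin, xmax, ymax) in boxes]
    return mirrored_image, mirrored_boxes
```

Applying the function twice returns the original image and boxes, which is a convenient sanity check for this kind of label transformation.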
It should be noted that, in the preferred embodiment of the present invention, the training mode of the network and the parameter configuration of the learning rate optimization strategy are non-limiting preferred choices, and those skilled in the art can select the training mode of the network and the parameter configuration according to various indexes such as the detection accuracy and the detection efficiency.
In the preferred embodiment of the present invention, the initial learning rate is set to 1 × 10^-3, the total number of training samples is 2707, the batch size is set to 16, and the learning rate is multiplied by 0.92 after every training epoch, for a total of 100 training epochs.
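The learning-rate schedule implied by these parameters can be written out as follows. The sketch assumes 0-based epoch numbering and that the last, incomplete batch of each epoch is still processed; the constant and function names are illustrative.

```python
import math

INITIAL_LR = 1e-3     # initial learning rate
DECAY = 0.92          # multiplicative factor applied after every epoch
NUM_SAMPLES = 2707    # total number of training samples
BATCH_SIZE = 16
EPOCHS = 100

def learning_rate(epoch):
    """Learning rate used during `epoch` (0-based)."""
    return INITIAL_LR * DECAY ** epoch

# Number of parameter updates per epoch, assuming the final partial batch is kept.
iterations_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
print(iterations_per_epoch)  # 170
```

With these values the learning rate falls by several orders of magnitude over the 100 epochs, so the later epochs amount to fine adjustment of the weights.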
Step 3.2 comprises controlling the overfitting risk of the substation multi-target detection network during iterative training based on an L2 regularization method.
Step 3.2 further comprises: the total number of training epochs of the multi-target detection network is 100; while the number of training epochs is less than or equal to 50, the parameters of the darknet-53 backbone network are frozen during training; once the number of training epochs exceeds 50, the entire multi-target detection network is trained.
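The two-stage schedule and the L2 penalty can be sketched as below. The parameter-group names (`"backbone"`, `"neck"`, `"head"`), the 1-based epoch numbering and the function names are hypothetical, and the regularization coefficient is left as a required argument because the text does not give its value.

```python
FREEZE_EPOCHS = 50    # backbone frozen for epochs 1..50
TOTAL_EPOCHS = 100

def trainable_groups(epoch):
    """Return the (hypothetical) parameter groups updated in `epoch` (1-based)."""
    if epoch <= FREEZE_EPOCHS:
        return ["neck", "head"]               # darknet-53 backbone is frozen
    return ["backbone", "neck", "head"]       # whole network is trained

def l2_penalty(weights, coefficient):
    """L2 regularization term added to the loss: coefficient * sum of squared weights."""
    return coefficient * sum(w * w for w in weights)
```

Freezing the pre-trained backbone in the first half lets the newly added layers adapt before the whole network is fine-tuned, which is a typical pattern in transfer learning.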
In step 3.3, the substation multi-target detection data set comprises category information and position information, and category detection values and position detection values are obtained through iterative training; the category detection value is compared with the true category value of the corresponding label, and the position detection value is compared with the true position value of the corresponding label, to obtain a category difference and a position difference; the category difference and the position difference are fed back to the substation multi-target detection network as gradients, and the network parameters are updated.
In the preferred embodiment of the invention, step 1 obtains the category information and position information of all targets in each image sample. During model training, the image samples are fed into the network, which, after a series of operations such as convolution and pooling, outputs the predicted target categories and positions; these are compared with the true category and position information of the corresponding labels, and the differences are propagated back to the network as gradients to update the network parameters. As samples are continuously fed into the network model, the model parameters are continuously updated, and the predictions of the network model gradually approach the true results. The complete data set consists of the raw image data and files containing the target category and position information. After the data set is constructed, it is randomly divided into a training set and a test set at a ratio of 9:1.
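A random 9:1 division of this kind can be sketched as follows. The fixed seed, the rounding rule and the function name are assumptions added so that the split is reproducible; the file names in the usage lines are placeholders, not names from the actual data set.

```python
import random

def split_dataset(samples, train_ratio=0.9, seed=0):
    """Randomly divide samples into a training set and a test set."""
    shuffled = list(samples)                  # copy, so the input is left unchanged
    random.Random(seed).shuffle(shuffled)     # seeded shuffle for reproducibility
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Placeholder file names for illustration only.
train_set, test_set = split_dataset([f"img_{i}.jpg" for i in range(100)])
print(len(train_set), len(test_set))  # 90 10
```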
Step 4, detecting the substation equipment using the trained multi-target detection network.
Preferably, in step 4, the trained multi-target detection network is deployed on the test system, and the multi-target detection result of the substation equipment is obtained by taking the image data of the substation equipment as input.
In the preferred embodiment of the invention, the trained model is deployed on a test system, and the generalization performance of the model on substation equipment target detection is obtained using the image data of the test set. The software configuration of the test system is Python 3.8, PyTorch 1.9.0, CUDA 11.2 and PyCharm 2020.3; the hardware configuration is an Intel Core i7-11700K CPU (8 cores, 16 threads), two RTX 3070 graphics cards (8 GB of video memory each) and 16 GB of RAM.
The detection model obtained in the preferred embodiment of the invention is tested on 311 images containing multiple substation targets. The average precision (AP) of each target class and the mean average precision (mAP) over all classes are recorded, and the results are shown in Table 1.
Some of the test result pictures are shown in fig. 5. As the test result pictures show, when tested on the data set of substation equipment, the detection model accurately identifies the targets in the test set with high positioning accuracy, and heavily occluded targets are also detected accurately.
As can be seen from Table 1, when the detection model of the embodiment is tested on the data set of substation equipment, the AP is 93.47% for the conservator, 92.07% for the radiator, 90.17% for the meter, 77.41% for the outgoing line bushing, 81.70% for the grading ring, 75.45% for the nameplate, 72.36% for the insulator, 69.81% for the isolating switch and 71.55% for the transformer; compared with the original YOLOv3 model, these values are respectively 1.01% higher, 0.93% lower, 1.96% higher, 0.32% higher, 1.4% higher, 2.48% higher, 1.91% higher, 5.43% higher and 35.75% higher. Except for a reduction in the recognition accuracy of the instrument, the detection accuracy of the other targets is improved, most notably for the transformer, whose sample count is highly unbalanced, where the detection accuracy rises by 35.75%. The overall mAP is 80.44%, an improvement of 5.26%. The results show that the model not only greatly improves the multi-target detection accuracy for the substation, but also alleviates the sample imbalance problem to a certain extent.
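The reported overall mAP can be reproduced from the nine per-class AP values quoted above, assuming mAP is the unweighted mean of the class APs (the usual definition). The dictionary keys are simply the class names as they appear in the text.

```python
# Per-class AP values (in percent) quoted in the text.
ap_percent = {
    "conservator": 93.47,
    "radiator": 92.07,
    "meter": 90.17,
    "outgoing line bushing": 77.41,
    "grading ring": 81.70,
    "nameplate": 75.45,
    "insulator": 72.36,
    "isolating switch": 69.81,
    "transformer": 71.55,
}

# mAP as the unweighted mean over all classes.
mean_ap = sum(ap_percent.values()) / len(ap_percent)
print(round(mean_ap, 2))  # 80.44
```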
Table 1: Test results of the model on the test set and comparison of schemes
Figure BDA0003415482170000121
In particular, the outgoing line bushing and the transformer targets in Table 1 are highly similar in shape, so with the original YOLOv3 network model they are difficult to recognize and the accuracy is low. The method of the invention significantly improves the recognition accuracy for such similar targets.
Compared with the prior art, the method has the following advantages: importance screening is performed automatically on the different channels of the feature map during feature extraction, which enhances the feature extraction capability of the network; the imbalance between feature maps during feature fusion is alleviated, which improves prediction accuracy; the problem of targets in the substation occluding one another is addressed, which increases the probability of recognizing occluded targets; the method adapts to interference from the complex substation environment, realizes end-to-end multi-target detection for the substation and outputs the detection result directly; and better results can be obtained on large-scale data sets.
The present applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings, but it should be understood by those skilled in the art that the above embodiments are merely preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not for limiting the scope of the present invention, and on the contrary, any improvement or modification made based on the spirit of the present invention should fall within the scope of the present invention.

Claims (14)

1. A substation multi-target detection method based on attention mechanism and feature balance is characterized in that,
the method comprises the following steps:
step 1, obtaining image data samples of substation equipment, recording target information of each image data sample by using data annotation software, and constructing a data set for multi-target detection of a substation by using the target information;
step 2, establishing a substation multi-target detection network based on the darknet-53 network and an attention mechanism, using the multi-target detection network to perform feature balancing and feature fusion on the feature maps in sequence to obtain fused feature maps, and performing multi-branch detection on the fused feature maps to obtain the substation multi-target detection result;
step 3, iteratively training the substation multi-target detection network on the substation multi-target detection data set based on a transfer learning method;
step 4, detecting the substation equipment using the trained multi-target detection network.
2. The substation multi-target detection method based on attention mechanism and feature balance of claim 1,
in step 1, image data samples of substation equipment are acquired under different illumination, time and weather conditions; the collected data samples include: a main transformer oil conservator, a main transformer outgoing line bushing, a main transformer heat dissipation device, an insulator string, a mutual inductor, a grading ring, an isolating switch, a nameplate and an instrument panel;
the target information includes a category of the target area and a location of the target area.
3. The substation multi-target detection method based on attention mechanism and feature balance according to claim 1 or 2,
the step 2 comprises the following steps:
step 2.1, forming an improved darknet-53 backbone network based on the darknet-53 network and the attention mechanism, and generating feature maps with different resolutions by the improved darknet-53 backbone network;
step 2.2, carrying out feature balance and feature fusion on each feature map in sequence to obtain a fused feature map;
step 2.3, performing detection on the fused feature maps, and taking the obtained features as the substation multi-target detection result.
4. The substation multi-target detection method based on attention mechanism and feature balance of claim 3,
step 2.1 comprises:
step 2.1.1, forming a residual unit by connecting several groups of convolution blocks in series; each convolution block comprises a convolution operation, a batch normalization operation and an activation function operation;
step 2.1.2, connecting different numbers of residual units in series to form different residual blocks;
step 2.1.3, forming a darknet-53 module from the first, second, third, fourth and fifth residual blocks;
step 2.1.4, on the basis of the darknet-53 module, adding a mixed attention module into the residual blocks that generate the feature maps used for prediction, to form an improved darknet-53 backbone network; feature maps of different resolutions are generated by the improved darknet-53 backbone network.
5. The substation multi-target detection method based on attention mechanism and feature balance of claim 4,
in step 2.1.4, the residual blocks that generate the feature maps used for prediction are the third, fourth and fifth residual blocks, and in each of them the mixed attention module is added before the next convolution.
6. The substation multi-target detection method based on attention mechanism and feature balance according to claim 4 or 5,
the mixed attention module comprises a spatial attention submodule and a channel attention submodule connected in series;
the channel attention submodule uses both maximum pooling and average pooling to preprocess the channel weights: the weight results after maximum pooling and average pooling are added, the sum is normalized, and the normalized value is used as the weight of each channel of the original feature map.
7. The substation multi-target detection method based on attention mechanism and feature balance of claim 4,
step 2.2 comprises:
step 2.2.1, performing feature balancing on each feature map, namely performing a convolution operation on each feature map so that the feature depths of the feature maps are consistent;
step 2.2.2, performing feature fusion on the balanced feature maps, namely performing a deconvolution operation on the balanced feature maps to upsample them, and fusing the upsampled feature maps with the feature maps generated by the improved darknet-53 backbone network to obtain the fused feature maps.
8. The substation multi-target detection method based on attention mechanism and feature balance of claim 7,
step 2.3 comprises:
step 2.3.1, in the feature map of each resolution, placing several anchor boxes of different sizes at each pixel position;
step 2.3.2, detecting the anchor boxes based on a classification and regression algorithm; the detection comprises:
1) classification branch detection, to obtain the category of the anchor boxes at each pixel;
2) bounding-box regression detection, to obtain the coordinates of the center point of the target area and the offsets of the width and height of the target bounding box relative to the anchor box;
step 2.3.3, performing a convolution operation on the classification branch detection result and the bounding-box regression detection result to extract detection features, and taking the detection features as the substation multi-target detection result.
9. The substation multi-target detection method based on attention mechanism and feature balance of claim 8,
the size of each anchor box is obtained by clustering the target sizes in the substation multi-target detection data set.
10. The substation multi-target detection method based on attention mechanism and feature balance of claim 3,
the step 3 comprises the following steps:
step 3.1, augmenting and expanding the substation multi-target detection data set using color gamut conversion, flipping and image mirroring;
step 3.2, loading the pre-training weight parameters of the multi-target detection network;
step 3.3, iteratively training the substation multi-target detection network based on the transfer learning method, with the Adam optimizer as the training optimizer.
11. The substation multi-target detection method based on attention mechanism and feature balance of claim 10,
step 3.2 comprises controlling the overfitting risk of the substation multi-target detection network during iterative training based on an L2 regularization method.
12. The substation multi-target detection method based on attention mechanism and feature balance of claim 10,
step 3.2 comprises: the total number of training epochs of the multi-target detection network is 100; while the number of training epochs is less than or equal to 50, the parameters of the darknet-53 backbone network are frozen during training; once the number of training epochs exceeds 50, the entire multi-target detection network is trained.
13. The substation multi-target detection method based on attention mechanism and feature balance of claim 10,
in the step 3.3, a data set of the multi-target detection of the transformer substation comprises category information and position information, and category detection values and position detection values are obtained through iterative training; comparing the class detection value with the class real value of the corresponding label, and comparing the position detection value with the position real value of the corresponding label to obtain class difference and position difference; and feeding back the category difference and the position difference to the multi-target detection network of the transformer substation in a gradient flow mode, and updating network parameters.
14. The substation multi-target detection method based on attention mechanism and feature balance of claim 10,
in step 4, the trained multi-target detection network is deployed on a test system, and image data of the substation equipment is taken as input to obtain the multi-target detection result for the substation equipment.
CN202111544623.1A 2021-12-16 2021-12-16 Substation multi-target detection method based on attention mechanism and feature balance Pending CN114241413A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111544623.1A CN114241413A (en) 2021-12-16 2021-12-16 Substation multi-target detection method based on attention mechanism and feature balance

Publications (1)

Publication Number Publication Date
CN114241413A true CN114241413A (en) 2022-03-25

Family

ID=80757453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111544623.1A Pending CN114241413A (en) 2021-12-16 2021-12-16 Substation multi-target detection method based on attention mechanism and feature balance

Country Status (1)

Country Link
CN (1) CN114241413A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114814776A (en) * 2022-06-24 2022-07-29 中国空气动力研究与发展中心计算空气动力研究所 PD radar target detection method based on graph attention network and transfer learning
CN114814776B (en) * 2022-06-24 2022-10-14 中国空气动力研究与发展中心计算空气动力研究所 PD radar target detection method based on graph attention network and transfer learning
CN115187603A (en) * 2022-09-13 2022-10-14 国网浙江省电力有限公司 Power equipment detection method and device based on deep neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination