CN115100549A - Transmission line hardware detection method based on improved YOLOv5

Transmission line hardware detection method based on improved YOLOv5

Info

Publication number
CN115100549A
Authority
CN
China
Prior art keywords
network
hardware
transmission line
yolov5
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210729380.7A
Other languages
Chinese (zh)
Inventor
董凯
申庆斌
王承一
董彦武
刘秋月
李�杰
卢自强
宋建虎
王宏飞
卢自英
秦俊兵
何鹏杰
茹海波
孙红玲
邢闯
史丽君
张博
温玮
李冰
宋欣
郝剑
丁喆
贾金川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Super High Voltage Transmission Branch Of State Grid Shanxi Electric Power Co
Original Assignee
Super High Voltage Transmission Branch Of State Grid Shanxi Electric Power Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Super High Voltage Transmission Branch Of State Grid Shanxi Electric Power Co filed Critical Super High Voltage Transmission Branch Of State Grid Shanxi Electric Power Co
Priority to CN202210729380.7A priority Critical patent/CN115100549A/en
Publication of CN115100549A publication Critical patent/CN115100549A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G06V 20/17 Terrestrial scenes taken from planes or by drones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Abstract

The invention discloses a transmission line hardware detection method based on improved YOLOv5. Building on the original YOLOv5, depthwise separable convolution is used to reduce the network's parameter count and computation, a squeeze-and-excitation channel attention module is used to strengthen the feature extraction capability of the convolution blocks, and the number of filters in the convolutional layers is further pruned through the geometric median, so that the network is made substantially more lightweight while the recognition accuracy of the original network is preserved. The invention applies YOLOv5 to transmission line hardware detection, meets the real-time and accuracy requirements of edge devices, and exhibits good robustness.

Description

Transmission line hardware detection method based on improved YOLOv5
Technical Field
The invention relates to the field of image target detection, and in particular to a transmission line hardware detection method based on improved YOLOv5.
Background
Transmission lines are the main carrier for transmitting electric energy over medium and long distances. They require metal fittings (hardware) for supporting, fixing and connecting components such as bare conductors and insulators; these fittings come in many varieties with clearly distinct appearances. Fittings damaged by corrosion or deformation can easily cause large-scale power outages and serious economic losses, so efficient detection of hardware targets helps automatically locate defect positions and is of great significance for ensuring the stable and safe operation of the power grid.
In recent years, with the popularization and development of the Internet of Things and artificial intelligence technology, inspecting transmission lines by sampling images with edge devices such as unmanned aerial vehicles and then automatically analyzing the acquired image samples with computer vision and image processing techniques has become one of the main inspection modes. However, mature network models currently have too many parameters, place high demands on hardware computing resources, and detect slowly; they cannot be applied directly to hardware samples photographed on transmission lines, cannot meet the speed and accuracy requirements of line inspection, and their huge parameter counts and computation cannot be deployed on edge devices with limited hardware resources.
Therefore, against this background, lightweight improvement of existing models to resolve the imbalance between detection speed and accuracy for transmission line hardware has become a core topic of research and application.
Disclosure of Invention
The invention aims to provide a transmission line hardware detection method based on improved YOLOv5 that resolves the imbalance between the recognition accuracy and speed of the original method and further reduces the inference time and energy consumption required for detection and recognition. A new backbone and head network is designed and improved, the number of filters in the convolutional layers is further pruned through the geometric median, and the network is made substantially more lightweight while the recognition accuracy of the original network is preserved.
In order to achieve the purpose, the invention provides the following scheme:
a power transmission line hardware detection method based on improved YOLOv5 comprises the following steps:
s1, aiming at the hardware sample image of the power transmission line, constructing a hardware data set, performing data cleaning and labeling work, and making hardware image sets of different types and different scales;
s2, selecting YOLOv5 as a basic framework, and using the lightweight deep separable volume blocks as a cascade module of a backbone network and a fusion channel of a simplified tail network;
s3, introducing an extrusion expansion channel attention mechanism to improve the characteristic expression capability of the rolling blocks;
s4, unifying sample resolution, expanding data set scale through an image enhancement method, and improving network training effect;
s5, training the model by adopting a stochastic gradient descent method, and predicting through a target horizontal coordinate, a vertical coordinate, a width, a height, a prediction confidence coefficient and a classification result to obtain a detection result of the hardware fitting image;
and S6, calculating the importance degree of each filter based on the geometric median, removing the unimportant redundant channel parameters, and recovering the recognition precision by fine tuning training.
Further, in step S1, constructing a transmission line hardware detection data set, performing data cleaning and labeling, and producing hardware image sets of different types and scales specifically includes:
cleaning the transmission line hardware images sampled and photographed in the field, and retaining samples that are clear, contain obvious exposed hardware regions, and are taken at reasonable angles.
Further, in step S2, selecting YOLOv5 as the basic architecture and using lightweight depthwise separable convolution blocks as the cascaded modules of the backbone network and as simplified fusion channels of the head network specifically includes:
first removing the backbone and head networks of the YOLOv5 framework, then replacing the backbone with a combination of depthwise separable convolution blocks, in which depthwise convolution extracts spatial features and pointwise convolution fuses and rescales channel information; finally, the sixth, fourth and second depthwise separable convolution blocks and the seventh and eleventh convolution layers of the network are selected to generate downsampled feature maps of 80 × 80, 40 × 40 and 20 × 20, respectively.
Further, in step S3, introducing a squeeze-and-excitation channel attention mechanism to improve the feature expression capability of the convolution blocks specifically includes:
constructing a squeeze-and-excitation channel attention mechanism by introducing 2 convolutional layers, keeping the number of input channels of the first layer and the number of output channels of the second layer consistent with the whole convolution block, reducing the number of channels by a factor of 4 at the output of the first layer and the input of the second layer, setting the convolution kernel size and stride to 1, and activating the features with global average pooling combined with the ReLU and SiLU functions, so as to recalibrate the importance of the channel features extracted by the depthwise separable convolution block; the calculation formula is as follows:
X_o = \mathrm{SiLU}\big(\mathrm{Conv2}(\mathrm{ReLU}(\mathrm{Conv1}(\mathrm{GAP}(X_i))))\big) \cdot X_i \quad (1)
where X_i and X_o respectively denote the output features of the convolution block and the output features after enhancement by the channel attention mechanism, GAP denotes global average pooling over the pixels of the whole feature map, Conv1 and Conv2 correspond to the first and second convolution operations, and ReLU and SiLU are the two activation functions, computed as follows:
\mathrm{ReLU}(x) = \max(0, x) \quad (2)
\mathrm{SiLU}(x) = \frac{x}{1 + e^{-x}} \quad (3)
further, in step S4, unifying the sample resolution, expanding the data set scale by an image enhancement method, and improving the network training effect specifically includes:
firstly, counting the quantity distribution of the sizes of all images, then uniformly processing the resolution of the images into 640 multiplied by 640, splicing the images through random scaling, random cutting and random arrangement of four images, and finally integrating corresponding detection frame label information according to the effect of random splicing.
Further, in step S5, a stochastic gradient descent method is used to train the model, and the detection result of the hardware image is predicted through the target horizontal coordinate, the vertical coordinate, the width, the height, the prediction confidence and the classification result. The method specifically comprises the following steps:
firstly, inputting the hardware image and label information into a modified YOLOv5 network, and according to the classification loss L cls Positioning loss L loc And target confidence loss L conf Optimizing the weight value of network parameters, dividing a characteristic graph output by a network into KxK grids, predicting 3 anchor frames by each grid, wherein the total loss of the network is the loss accumulated sum of all grid anchor frames, and a loss calculation formula for a single anchor frame is as follows:
L_{cls} = -\sum_{i=1}^{N}\big[p_i^{gt}\log(p_i) + (1 - p_i^{gt})\log(1 - p_i)\big] \quad (4)
L_{loc} = 1 - \mathrm{IOU} + \frac{D_p^2}{D_L^2} + \alpha v, \qquad \alpha = \frac{v}{(1 - \mathrm{IOU}) + v} \quad (5)
v = \frac{4}{\pi^2}\Big(\arctan\frac{w_{gt}}{h_{gt}} - \arctan\frac{w_p}{h_p}\Big)^2 \quad (6)
L_{conf} = -\big[c_{gt}\log(c) + (1 - c_{gt})\log(1 - c)\big] \quad (7)
where N is the number of hardware categories to be detected, p_i^{gt} is the true label value of the i-th class for the sample, p_i is the network's predicted value for the i-th class, D_p and D_L respectively denote the Euclidean distance between the center points of the network prediction box and the ground-truth label box and the diagonal length of their minimum enclosing rectangle, IOU is the ratio of the intersection to the union of the areas of the prediction box and the ground-truth box, v is a parameter measuring aspect-ratio consistency, w_{gt}, h_{gt}, w_p and h_p respectively denote the width and height of the ground-truth box and the width and height of the prediction box, and c_{gt} and c respectively denote the confidence label of whether an object exists at the prediction box position and the network's predicted confidence that an object exists.
Then, the network training effect is evaluated through the mean Average Precision (mAP), Recall and Precision, calculated as follows:
\mathrm{Precision} = \frac{TP_j}{TP_j + FP_j} \quad (8)
\mathrm{Recall} = \frac{TP_j}{TP_j + FN_j} \quad (9)
\mathrm{AP} = \int_0^1 \mathrm{Precision}\,\mathrm{d}\,\mathrm{Recall} \quad (10)
\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AP}_i \quad (11)
where TP_j, FP_j and FN_j respectively denote, among the first j bounding boxes predicted by the model, the number whose area intersection-over-union with the ground-truth label box is greater than 0.5, the number whose intersection-over-union is less than 0.5, and the number misidentified as other categories; finally, the resulting model parameter count and floating point operations (FLOPs) are computed to measure the degree of network lightweighting.
Further, in step S6, calculating the importance of each filter based on the geometric median, removing unimportant redundant channel parameters, and recovering the recognition accuracy through fine-tuning training specifically includes:
first, for a given convolutional layer, sorting the parameter weight tensors of all filters in descending order of their L1 norm; then dividing all layers of the network evenly into 7 parts, with the pruning rate of each part following an arithmetic progression whose cumulative sum equals the set network parameter pruning proportion; and finally removing, according to the pruning rate, the channels whose accumulated Euclidean distance to the other filters is smallest, and performing network forward propagation with only the remaining channel parameters, thereby achieving simplification and compression of the final model.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects: the invention provides a transmission line hardware detection method based on improved YOLOv5 that constructs a hardware data set, adopts an improved YOLOv5 network as the basic detection model, improves the head and backbone networks to reduce network parameters, integrates a squeeze-and-excitation channel attention module to enhance the feature extraction capability of the convolution blocks, and finally further prunes the number of filters in the convolutional layers through the geometric median.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a diagram of the YOLOv5 algorithm;
FIG. 2 is a diagram of a modified YOLOv5 algorithm;
FIG. 3 is a diagram of a depthwise separable convolution block incorporating a channel attention mechanism;
fig. 4 is an effect diagram of the transmission line hardware detection.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A transmission line hardware detection method based on improved YOLOv5 mainly comprises a data acquisition and processing part, a parameter reduction and compression part, and a network training and testing part. The prior-art YOLOv5 algorithm structure is shown in FIG. 1, and the network structure after the improvement of YOLOv5 is shown in FIG. 2. The improvements address the huge parameter count and the imbalance between model complexity and accuracy of the original YOLOv5 network, with the aim of reducing the network's parameter count and computation as much as possible while maintaining high detection accuracy.
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The method comprises the following steps:
s1, aiming at the hardware sample image of the power transmission line, constructing a hardware data set, performing data cleaning and labeling work, and making hardware image sets of different types and different scales;
s2, selecting YOLOv5 as a basic framework, and using the lightweight deep separable volume blocks as a cascade module of a backbone network and a fusion channel of a simplified tail network;
s3, introducing an extrusion expansion channel attention mechanism to improve the characteristic expression capability of the rolling blocks;
s4, unifying sample resolution, expanding data set scale through an image enhancement method, and improving network training effect;
s5, training the model by adopting a stochastic gradient descent method, and predicting the detection result of the hardware image through the target horizontal coordinate, the vertical coordinate, the width, the height, the prediction confidence coefficient and the classification result;
and S6, calculating the importance degree of each filter based on the geometric median, removing unimportant redundant channel parameters, and recovering the recognition precision by fine tuning training.
The deep learning model needs to be trained and optimized on a large number of labeled image samples. Because the unmanned aerial vehicle acquires global images of the transmission line by aerial photography, key regions of the samples need to be sub-sampled according to the network input resolution, and to improve the network's training effect the samples containing hardware types need to be further cleaned and screened. Therefore, in step S1, a hardware data set is constructed from transmission line hardware sample images, data cleaning and labeling are performed, and hardware image sets of different types and scales are produced, specifically including:
cleaning the transmission line hardware images sampled and photographed in the field, and retaining clear samples that contain obvious exposed hardware regions and are taken at reasonable angles.
Considering that the original YOLOv5 network has too many parameters to meet the detection requirements of resource-limited equipment, the method introduces depthwise separable convolution to reduce the parameters and computation of the hardware detection network. In step S2, selecting YOLOv5 as the basic architecture and using lightweight depthwise separable convolution blocks as the cascaded modules of the backbone network and as simplified fusion channels of the head network specifically includes:
first removing the backbone and head networks of the YOLOv5 framework, then replacing the backbone with a combination of depthwise separable convolution blocks, in which depthwise convolution extracts spatial features and pointwise convolution fuses and rescales channel information; finally, the sixth, fourth and second depthwise separable convolution blocks and the seventh and eleventh convolution layers of the network are selected to generate downsampled feature maps of 80 × 80, 40 × 40 and 20 × 20, respectively.
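A minimal PyTorch sketch of a depthwise separable convolution block of the kind described above: a depthwise convolution extracts spatial features channel by channel, and a 1 × 1 pointwise convolution fuses and rescales the channel information. The kernel size, batch normalization and SiLU activation shown here are illustrative assumptions rather than the exact layer configuration of the invention.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv (spatial features) followed by pointwise conv (channel fusion)."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        # Pointwise: 1x1 conv fuses and rescales channel information.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

# Example: a stride-2 block halves a 640x640 input to 320x320.
x = torch.randn(1, 32, 640, 640)
print(DepthwiseSeparableConv(32, 64, stride=2)(x).shape)  # torch.Size([1, 64, 320, 320])
```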
In order to compensate for the reduced fitting ability caused by depthwise separable convolution, the squeeze-and-excitation channel attention module is cascaded with the depthwise separable convolution, as shown in FIG. 3. In step S3, a squeeze-and-excitation channel attention mechanism is introduced to improve the feature expression capability of the convolution block, specifically including:
constructing a squeeze-and-excitation channel attention mechanism inside the convolution block to obtain channel importance coefficients that are multiplied with the original feature map; 2 convolutional layers are introduced, the number of input channels of the first layer and the number of output channels of the second layer are kept consistent with the whole convolution block, the output of the first layer and the input of the second layer reduce the number of channels by a factor of four, the convolution kernel size and stride are both set to 1, and the features are activated with global average pooling combined with ReLU and SiLU, so as to recalibrate the importance of the channel features extracted by the depthwise separable convolution block; the calculation formula is as follows:
X_o = \mathrm{SiLU}\big(\mathrm{Conv2}(\mathrm{ReLU}(\mathrm{Conv1}(\mathrm{GAP}(X_i))))\big) \cdot X_i \quad (1)
where X_i and X_o respectively denote the output features of the convolution block and the output features after enhancement by the channel attention mechanism, GAP denotes global average pooling over the pixels of the whole feature map, Conv1 and Conv2 correspond to the first and second convolution operations, and ReLU and SiLU are the two activation functions, computed as follows:
\mathrm{ReLU}(x) = \max(0, x) \quad (2)
\mathrm{SiLU}(x) = \frac{x}{1 + e^{-x}} \quad (3)
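The squeeze-and-excitation step of formula (1) can be sketched in PyTorch as follows: global average pooling squeezes each channel to a scalar, two 1 × 1 convolutions with a 4-fold channel reduction and the ReLU/SiLU activations produce per-channel weights, and these weights rescale the block output X_i. Class and variable names are illustrative.

```python
import torch
import torch.nn as nn

class SEChannelAttention(nn.Module):
    """Implements X_o = SiLU(Conv2(ReLU(Conv1(GAP(X_i))))) * X_i from formula (1)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)   # GAP over the whole feature map
        self.conv1 = nn.Conv2d(channels, channels // reduction, kernel_size=1, stride=1)
        self.conv2 = nn.Conv2d(channels // reduction, channels, kernel_size=1, stride=1)
        self.relu = nn.ReLU()
        self.silu = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.silu(self.conv2(self.relu(self.conv1(self.gap(x)))))
        return w * x                          # channel-wise recalibration of X_i

x = torch.randn(1, 64, 80, 80)
print(SEChannelAttention(64)(x).shape)  # torch.Size([1, 64, 80, 80])
```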
in step S4, unifying sample resolution, expanding data set scale by an image enhancement method, and improving network training effect, specifically including:
firstly, the number distribution of the sizes of all the images is counted, then the image resolution is uniformly processed into 640 multiplied by 640, splicing is carried out through random scaling, random cutting and random arrangement of four images, and finally corresponding detection frame label information is integrated according to the effect of random splicing.
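A simplified sketch of the four-image splicing described above (the detection-box label remapping mentioned in the text is omitted). The 640 × 640 canvas size follows the text; the random split-point range and padding value are illustrative assumptions.

```python
import random

import cv2
import numpy as np

def mosaic4(images, out_size=640):
    """Stitch four images into one out_size x out_size mosaic.

    A random centre point splits the canvas into four regions; each input
    image is resized to fill one region. Label remapping is omitted here.
    """
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)
    # Random split point, kept away from the canvas borders.
    cx = random.randint(out_size // 4, 3 * out_size // 4)
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    regions = [
        (0, 0, cx, cy),                   # top-left
        (cx, 0, out_size, cy),            # top-right
        (0, cy, cx, out_size),            # bottom-left
        (cx, cy, out_size, out_size),     # bottom-right
    ]
    for img, (x1, y1, x2, y2) in zip(images, regions):
        canvas[y1:y2, x1:x2] = cv2.resize(img, (x2 - x1, y2 - y1))
    return canvas
```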
In step S5, training the model with stochastic gradient descent and predicting the detection result of the hardware image from the target's horizontal coordinate, vertical coordinate, width, height, prediction confidence and classification result specifically includes:
first inputting the hardware images and label information into the improved YOLOv5 network and optimizing the network parameter weights according to the classification loss L_cls, localization loss L_loc and target confidence loss L_conf; the feature map output by the network is divided into K × K grids, each grid predicts 3 anchor boxes, and the total network loss is the accumulated loss over all grid anchor boxes; the loss for a single anchor box is calculated as follows:
L_{cls} = -\sum_{i=1}^{N}\big[p_i^{gt}\log(p_i) + (1 - p_i^{gt})\log(1 - p_i)\big] \quad (4)
L_{loc} = 1 - \mathrm{IOU} + \frac{D_p^2}{D_L^2} + \alpha v, \qquad \alpha = \frac{v}{(1 - \mathrm{IOU}) + v} \quad (5)
v = \frac{4}{\pi^2}\Big(\arctan\frac{w_{gt}}{h_{gt}} - \arctan\frac{w_p}{h_p}\Big)^2 \quad (6)
L_{conf} = -\big[c_{gt}\log(c) + (1 - c_{gt})\log(1 - c)\big] \quad (7)
where N is the number of hardware categories to be detected, p_i^{gt} is the true label value of the i-th class for the sample, p_i is the network's predicted value for the i-th class, D_p and D_L respectively denote the Euclidean distance between the center points of the network prediction box and the ground-truth label box and the diagonal length of their minimum enclosing rectangle, IOU is the ratio of the intersection to the union of the areas of the prediction box and the ground-truth box, v is a parameter measuring aspect-ratio consistency, w_{gt}, h_{gt}, w_p and h_p respectively denote the width and height of the ground-truth box and the width and height of the prediction box, and c_{gt} and c respectively denote the confidence label of whether an object exists at the prediction box position and the network's predicted confidence that an object exists.
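The per-anchor losses of formulas (4) to (7) can be sketched for a single prediction as follows, using the symbols defined above. The CIoU-style form of L_loc, including the weighting coefficient alpha, follows the reconstruction given in formula (5) and should be read as an assumption rather than the exact formulation of the invention.

```python
import math

import torch

def anchor_losses(p, p_gt, iou, d_p, d_l, w_p, h_p, w_gt, h_gt, c, c_gt):
    """Per-anchor losses for one prediction, following formulas (4)-(7).

    p, p_gt   : predicted / true class probabilities, tensors of shape (N,)
    iou       : IoU between predicted and ground-truth boxes (float)
    d_p, d_l  : centre-point distance and enclosing-box diagonal (floats)
    w_*, h_*  : box widths and heights; c, c_gt: objectness and its label
    """
    eps = 1e-7
    # (4) binary cross-entropy over the N hardware classes
    l_cls = -(p_gt * torch.log(p + eps) + (1 - p_gt) * torch.log(1 - p + eps)).sum()
    # (6) aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w_p / h_p)) ** 2
    alpha = v / (1 - iou + v + eps)
    # (5) CIoU-style localisation loss
    l_loc = 1 - iou + (d_p ** 2) / (d_l ** 2 + eps) + alpha * v
    # (7) objectness confidence loss
    l_conf = -(c_gt * math.log(c + eps) + (1 - c_gt) * math.log(1 - c + eps))
    return l_cls, l_loc, l_conf
```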
Then, the network training effect is evaluated through the mean Average Precision (mAP), Recall and Precision, calculated as follows:
\mathrm{Precision} = \frac{TP_j}{TP_j + FP_j} \quad (8)
\mathrm{Recall} = \frac{TP_j}{TP_j + FN_j} \quad (9)
\mathrm{AP} = \int_0^1 \mathrm{Precision}\,\mathrm{d}\,\mathrm{Recall} \quad (10)
\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AP}_i \quad (11)
where TP_j, FP_j and FN_j respectively denote, among the first j bounding boxes predicted by the model, the number whose area intersection-over-union with the ground-truth label box is greater than 0.5, the number whose intersection-over-union is less than 0.5, and the number misidentified as other categories; finally, the resulting model parameter count and floating point operations (FLOPs) are computed to measure the degree of network lightweighting.
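The evaluation metrics above can be computed roughly as sketched below. The 0.5 IoU threshold for counting true and false positives follows the text; computing AP by all-point interpolation of the precision-recall curve is an assumption of this sketch.

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """Precision and recall from cumulative TP/FP/FN counts (formulas (8)-(9))."""
    precision = tp / (tp + fp + 1e-12)
    recall = tp / (tp + fn + 1e-12)
    return precision, recall

def average_precision(recalls, precisions):
    """Area under the precision-recall curve (all-point interpolation, formula (10))."""
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    # Make precision monotonically decreasing, then integrate over recall.
    p = np.maximum.accumulate(p[::-1])[::-1]
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

def mean_average_precision(ap_per_class):
    """mAP: mean of the per-class AP values (formula (11))."""
    return float(np.mean(ap_per_class))
```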
Deep neural networks contain redundant parameters that ease fitting and convergence on the training samples, so the channel pruning technique can further lighten the trained model. In step S6, the importance of each filter is calculated based on the geometric median, unimportant redundant channel parameters are removed, and the recognition accuracy is recovered through fine-tuning training, specifically including:
first, for a given convolutional layer, sorting the parameter weight tensors of all filters in descending order of their L1 norm; then dividing all layers of the network evenly into 7 parts, with the pruning rate of each part following an arithmetic progression whose cumulative sum equals the set network parameter pruning proportion; and finally removing, according to the pruning rate, the channels in each layer whose accumulated Euclidean distance to the other filters is smallest, and performing network forward propagation with only the remaining channel parameters, thereby achieving simplified compression of the model at prediction time.
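A simplified sketch of the distance-based filter selection described above: for one convolutional layer, the filters whose summed Euclidean distance to all other filters is smallest (i.e. those closest to the geometric median of the layer's filters) are treated as the most redundant and removed first. The layer-wise arithmetic-progression scheduling of pruning rates is omitted, and the function and variable names are illustrative.

```python
import torch
import torch.nn as nn

def redundant_filter_indices(conv: nn.Conv2d, prune_ratio: float) -> torch.Tensor:
    """Return indices of the filters closest to the layer's geometric median.

    Filters with the smallest summed Euclidean distance to all other filters
    carry the most replaceable information and are pruned first.
    """
    w = conv.weight.data.flatten(1)        # (out_channels, in_channels * k * k)
    dist = torch.cdist(w, w, p=2)          # pairwise Euclidean distances
    scores = dist.sum(dim=1)               # summed distance per filter
    n_prune = int(prune_ratio * w.size(0))
    return torch.argsort(scores)[:n_prune] # smallest sums = most redundant

conv = nn.Conv2d(64, 128, kernel_size=3)
print(redundant_filter_indices(conv, prune_ratio=0.3).shape)  # torch.Size([38])
```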
The detection effect of the method of the present invention is shown in FIG. 4. On the basis of the YOLOv5 network, the method improves the backbone and head networks to reduce the model size, introduces a squeeze-and-excitation channel attention module to address the insufficient feature extraction capability of depthwise separable convolution, and further prunes the number of filters in the convolutional layers through the geometric median to address redundancy in the trained model, greatly improving the lightweight degree of the network while preserving the recognition accuracy of the original network. The invention effectively improves the detection performance of the YOLOv5 network, guaranteeing detection accuracy while increasing the detection speed for various transmission line hardware fittings.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (7)

1. A transmission line hardware detection method based on improved YOLOv5, characterized by comprising the following steps:
S1, constructing a hardware data set from transmission line hardware sample images, performing data cleaning and labeling, and producing hardware image sets of different types and scales;
S2, selecting YOLOv5 as the basic architecture, and using lightweight depthwise separable convolution blocks as the cascaded modules of the backbone network and as simplified fusion channels of the head network;
S3, introducing a squeeze-and-excitation channel attention mechanism to improve the feature expression capability of the convolution blocks;
S4, unifying the sample resolution, expanding the data set scale through image augmentation, and improving the network training effect;
S5, training the model with stochastic gradient descent, and predicting the target's horizontal coordinate, vertical coordinate, width, height, prediction confidence and classification result to obtain the detection result for the hardware image;
and S6, calculating the importance of each filter based on the geometric median, removing unimportant redundant channel parameters, and recovering the recognition accuracy through fine-tuning training.
2. The transmission line hardware detection method based on improved YOLOv5 according to claim 1, wherein in step S1, constructing a transmission line hardware detection data set, performing data cleaning and labeling, and producing hardware image sets of different types and scales specifically comprises:
cleaning the transmission line hardware images sampled and photographed in the field, and retaining clear samples that contain obvious exposed hardware regions and are taken at reasonable angles.
3. The transmission line hardware detection method based on improved YOLOv5 according to claim 1, wherein in step S2, selecting YOLOv5 as the basic architecture and using lightweight depthwise separable convolution blocks as the cascaded modules of the backbone network and as simplified fusion channels of the head network specifically comprises:
first removing the backbone and head networks of the YOLOv5 framework, then replacing the backbone with a combination of depthwise separable convolution blocks, in which depthwise convolution extracts spatial features and pointwise convolution fuses and rescales channel information, and finally selecting the sixth, fourth and second depthwise separable convolution blocks and the seventh and eleventh convolution layers of the network to generate downsampled feature maps of 80 × 80, 40 × 40 and 20 × 20, respectively.
4. The transmission line hardware detection method based on improved YOLOv5 according to claim 1, wherein in step S3, introducing a squeeze-and-excitation channel attention mechanism to improve the feature expression capability of the convolution block specifically comprises:
constructing a squeeze-and-excitation channel attention mechanism by introducing 2 convolutional layers, keeping the number of input channels of the first layer and the number of output channels of the second layer consistent with the whole convolution block, reducing the number of channels by a factor of 4 at the output of the first layer and the input of the second layer, setting the convolution kernel size and stride to 1, and activating the features with global average pooling combined with the ReLU and SiLU functions, so as to recalibrate the importance of the channel features extracted by the depthwise separable convolution block, where the calculation formula is as follows:
X_o = \mathrm{SiLU}\big(\mathrm{Conv2}(\mathrm{ReLU}(\mathrm{Conv1}(\mathrm{GAP}(X_i))))\big) \cdot X_i \quad (1)
where X_i and X_o respectively denote the output features of the convolution block and the output features after enhancement by the channel attention mechanism, GAP denotes global average pooling over the pixels of the whole feature map, Conv1 and Conv2 correspond to the first and second convolution operations, and ReLU and SiLU are the two activation functions, computed as follows:
\mathrm{ReLU}(x) = \max(0, x) \quad (2)
\mathrm{SiLU}(x) = \frac{x}{1 + e^{-x}} \quad (3)
5. The transmission line hardware detection method based on improved YOLOv5 according to claim 1, wherein in step S4, unifying the sample resolution, expanding the data set scale through image augmentation, and improving the network training effect specifically comprises:
first counting the size distribution of all images, then uniformly resizing the images to a resolution of 640 × 640, splicing four images together through random scaling, random cropping and random arrangement, and finally merging the corresponding detection-box label information according to the random splicing result.
6. The transmission line hardware detection method based on improved YOLOv5 according to claim 1, wherein in step S5, training the model with stochastic gradient descent and predicting the detection result of the hardware image from the target's horizontal coordinate, vertical coordinate, width, height, prediction confidence and classification result specifically comprises:
first inputting the hardware images and label information into the improved YOLOv5 network and optimizing the network parameter weights according to the classification loss L_cls, localization loss L_loc and target confidence loss L_conf; the feature map output by the network is divided into K × K grids, each grid predicts 3 anchor boxes, and the total network loss is the accumulated loss over all grid anchor boxes; the loss for a single anchor box is calculated as follows:
L_{cls} = -\sum_{i=1}^{N}\big[p_i^{gt}\log(p_i) + (1 - p_i^{gt})\log(1 - p_i)\big] \quad (4)
L_{loc} = 1 - \mathrm{IOU} + \frac{D_p^2}{D_L^2} + \alpha v, \qquad \alpha = \frac{v}{(1 - \mathrm{IOU}) + v} \quad (5)
v = \frac{4}{\pi^2}\Big(\arctan\frac{w_{gt}}{h_{gt}} - \arctan\frac{w_p}{h_p}\Big)^2 \quad (6)
L_{conf} = -\big[c_{gt}\log(c) + (1 - c_{gt})\log(1 - c)\big] \quad (7)
where N is the number of hardware categories to be detected, p_i^{gt} is the true label value of the i-th class for the sample, p_i is the network's predicted value for the i-th class, D_p and D_L respectively denote the Euclidean distance between the center points of the network prediction box and the ground-truth label box and the diagonal length of their minimum enclosing rectangle, IOU is the ratio of the intersection to the union of the areas of the prediction box and the ground-truth box, v is a parameter measuring aspect-ratio consistency, w_{gt}, h_{gt}, w_p and h_p respectively denote the width and height of the ground-truth box and the width and height of the prediction box, and c_{gt} and c respectively denote the confidence label of whether an object exists at the prediction box position and the network's predicted confidence that an object exists.
Then, the network training effect is evaluated through the mean Average Precision (mAP), Recall and Precision, calculated as follows:
\mathrm{Precision} = \frac{TP_j}{TP_j + FP_j} \quad (8)
\mathrm{Recall} = \frac{TP_j}{TP_j + FN_j} \quad (9)
\mathrm{AP} = \int_0^1 \mathrm{Precision}\,\mathrm{d}\,\mathrm{Recall} \quad (10)
\mathrm{mAP} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{AP}_i \quad (11)
where TP_j, FP_j and FN_j respectively denote, among the first j bounding boxes predicted by the model, the number whose area intersection-over-union with the ground-truth label box is greater than 0.5, the number whose intersection-over-union is less than 0.5, and the number misidentified as other categories; finally, the resulting model parameter count and floating point operations (FLOPs) are computed to measure the degree of network lightweighting.
7. The transmission line hardware detection method based on improved YOLOv5 according to claim 1, wherein in step S6, calculating the importance of each filter based on the geometric median, removing unimportant redundant channel parameters, and recovering the recognition accuracy through fine-tuning training specifically comprises:
first, for a given convolutional layer, sorting the parameter weight tensors of all filters in descending order of their L1 norm; then dividing all layers of the network evenly into 7 parts, with the pruning rate of each part following an arithmetic progression whose cumulative sum equals the set network parameter pruning proportion; and finally removing, according to the pruning rate, the channels in each layer whose accumulated Euclidean distance to the other filters is smallest, and performing network forward propagation with only the remaining channel parameters, thereby achieving simplified compression of the final model.
CN202210729380.7A 2022-06-24 2022-06-24 Transmission line hardware detection method based on improved YOLOv5 Pending CN115100549A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210729380.7A CN115100549A (en) 2022-06-24 2022-06-24 Transmission line hardware detection method based on improved YOLOv5

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210729380.7A CN115100549A (en) 2022-06-24 2022-06-24 Transmission line hardware detection method based on improved YOLOv5

Publications (1)

Publication Number Publication Date
CN115100549A true CN115100549A (en) 2022-09-23

Family

ID=83292700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210729380.7A Pending CN115100549A (en) 2022-06-24 2022-06-24 Transmission line hardware detection method based on improved YOLOv5

Country Status (1)

Country Link
CN (1) CN115100549A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908999A (en) * 2022-11-25 2023-04-04 合肥中科类脑智能技术有限公司 Method for detecting corrosion of top hardware fitting of power distribution tower, medium and edge terminal equipment
CN115935263A (en) * 2023-02-22 2023-04-07 和普威视光电股份有限公司 Yoov 5 pruning-based edge chip detection and classification method and system
CN116612087A (en) * 2023-05-22 2023-08-18 山东省人工智能研究院 Coronary artery CTA stenosis detection method based on YOLOv5-LA
CN116612087B (en) * 2023-05-22 2024-02-23 山东省人工智能研究院 Coronary artery CTA stenosis detection method based on YOLOv5-LA

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination