CN115410087A

CN115410087A - Transmission line foreign matter detection method based on improved YOLOv4

Info

Publication number: CN115410087A
Application number: CN202211052529.9A
Authority: CN
Inventors: 朱傥; 杨忠; 薛八阳; 张驰; 李国涛; 廖禄伟; 杨欣
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2022-08-30
Filing date: 2022-08-30
Publication date: 2022-11-29

Abstract

The invention discloses a foreign matter detection method for a power transmission line based on improved YOLOv4, which comprises the following steps: collecting a power transmission line patrol video, performing framing processing on the power transmission line patrol video, and performing data cleaning on a scene picture which is consistent with the condition that the power transmission line is attached with foreign matters; then, labeling is carried out, and a foreign matter detection data set in a power transmission line scene is constructed; constructing an improved YOLOv4 network model; training the improved YOLOv4 network model based on a data set, and storing the trained weight and the hyper-parameter; and detecting the test set picture by using the saved and trained weight to obtain a detection result of the foreign matter image of the power transmission line. The invention greatly reduces the network parameters, improves the detection precision, has good robustness, can effectively improve the foreign object identification capability of the unmanned aerial vehicle, can be more efficiently applied to embedded equipment such as a mobile terminal and the like, and meets the detection requirement of the unmanned aerial vehicle on the foreign object target of the power transmission line in the process of routing inspection and obstacle removal.

Description

Transmission line foreign matter detection method based on improved YOLOv4

Technical Field

The invention relates to target detection technology, and in particular to a transmission line foreign matter detection method based on improved YOLOv4.

Background

Guaranteeing the safe, stable and efficient operation of power transmission lines is of great significance for electric energy transmission. In recent years, foreign matter has repeatedly ended up on power transmission lines as a result of human activities such as kite flying and balloon release, and incidents affecting the stable operation of the power system have occurred again and again; in serious cases they cause tripping accidents and paralyze part of the regional power system. The foreign matter hung on power transmission lines mainly comprises waste such as kites, balloons and plastic films, as well as nests built by birds on towers. How to discover such foreign matter in time has become an important subject of intelligent power transmission line inspection.

The methods for detecting foreign matter on power transmission lines fall into two types: traditional methods based on hand-crafted feature extraction, and deep learning based methods. The traditional methods judge whether foreign matter exists by detecting the outline of the transmission conductor; their detection effect is affected by factors such as complex backgrounds and noise, and their precision is low. In recent years, deep learning has become a widely known hotspot technology and overcomes some disadvantages of traditional machine learning. A neural network is not easily affected by geometric transformation, deformation or illumination of the detection target, which reduces the identification difficulty caused by the deformation of foreign matter attached to a power line. It can also learn the features of the detected target automatically, avoiding the complexity of manual feature design, and therefore has obvious advantages over traditional target detection algorithms. Deep learning based target detection algorithms are divided into one-stage and two-stage algorithms. A two-stage algorithm needs to generate a large number of candidate boxes, so its computation is heavy and its detection speed slow, and it cannot reach real-time detection in actual industrial applications; a representative is Fast R-CNN. A one-stage algorithm generates no candidate boxes and directly regresses the class confidence and position coordinates of the target, so it achieves higher speed; representatives are the SSD and YOLO series. YOLOv4 is a current mainstream target detection algorithm with a good detection effect. However, the YOLOv4 model is still large: to obtain high precision it carries a huge number of parameters, which makes it difficult to deploy on embedded devices with limited storage space.

Disclosure of Invention

Purpose of the invention: the invention aims to provide a power transmission line foreign matter detection method based on improved YOLOv4 that reduces the number of model parameters, improves model robustness, and achieves higher precision and speed, so that real-time detection is realized and the model is better suited to embedded equipment with limited storage space and computing capacity, such as mobile terminals.

The technical scheme is as follows: the invention discloses a foreign matter detection method for a power transmission line based on improved YOLOv4, which comprises the following steps:

collecting a power transmission line inspection video, splitting the video into frames, and performing data cleaning on the scene pictures that show foreign matter attached to the power transmission line;

labeling the cleaned scene pictures, constructing a foreign matter detection data set for the power transmission line scene from the labeled pictures, and dividing it into a training set, a verification set and a test set;

constructing an improved YOLOv4 network model;

training the improved YOLOv4 network model based on a training set, verifying the trained improved YOLOv4 network model based on a verification set, and storing the weight and the hyper-parameter with the highest detection precision on the verification set;

and detecting the test set picture by using the stored weight to obtain a detection result of the foreign matter image of the power transmission line.

Further, the method for performing data cleaning on the scene pictures that show foreign matter attached to the power transmission line comprises the following steps:

converting all such scene pictures into jpg or png format;

performing data enhancement and data set expansion on the format-converted scene pictures, wherein the expansion methods include horizontal flipping, color gamut conversion, size scaling and mosaic data enhancement, to obtain foreign matter images;

and embedding the foreign matter images into power transmission line background images, wherein each foreign matter image is smaller than the background image.

Further, the method for labeling the cleaned scene pictures comprises the following steps:

labeling each picture with the labelImg labeling tool to form a corresponding xml label file, wherein the xml label file is in PASCAL VOC format and contains the two diagonal corner coordinates of each rectangular target in the picture and its assigned category.

Further, the construction method of the improved YOLOv4 network model comprises the following steps:

a YOLOv4 network model is taken as a basic model;

replacing a backbone network CSPDarkNet53 of the YOLOv4 network model with a lightweight GhostNet to perform feature extraction;

replacing the spatial pyramid pooling (SPP) module of the YOLOv4 network model with an SPPF module;

replacing convolutional layers with convolution kernel sizes of 3 x 3 in three-layer convolutional blocks and five-layer convolutional blocks of a YOLOv4 network model and downsampling layers and prediction layers by depth separable convolutional layers;

inserting an ECA module after an up-sampling layer and a down-sampling layer of a YOLOv4 network model;

in the path aggregation network and prediction layer part of the original YOLOv4 algorithm, replacing the Leaky ReLU activation function in the ordinary convolutions with the SiLU activation function;

applying a k-means clustering algorithm to generate six small- and medium-size anchor point frames adapted to the foreign matter detection data set in the power transmission line scene, and combining them with the three default large-size anchor point frames to form the final nine foreign matter anchor point frame sizes;

and finally obtaining an improved YOLOv4 network model.

The invention relates to a foreign matter detection system of a power transmission line based on improved YOLOv4, which comprises:

the data acquisition module is used for acquiring a power transmission line inspection video;

the data processing module is used for performing framing processing on the acquired aerial video of the power transmission line and performing data cleaning on the scene picture which is in line with the foreign matter on the power transmission line;

the marking module is used for performing labeling processing on the scene pictures which accord with the condition that the foreign matters are attached to the power transmission line after data cleaning, constructing a foreign matter detection data set under the power transmission line scene based on the scene pictures which accord with the condition that the foreign matters are attached to the power transmission line after the labeling processing, and dividing a training set, a verification set and a test set;

the model construction module is used for constructing an improved YOLOv4 network model;

the model training and verifying module is used for training the improved YOLOv4 network model based on a training set, verifying the trained improved YOLOv4 network model based on a verifying set and storing the weight and the hyper-parameter with the highest detection precision on the verifying set;

and the test module is used for detecting the test set pictures by using the stored weight to obtain a detection result of the foreign matter image of the power transmission line.

Preferably, the process of performing data cleaning on the scene picture which is in line with the foreign object on the power transmission line is as follows:

carrying out data enhancement and data set expansion on the scene picture which is subjected to format conversion and accords with the condition that the foreign matter is attached to the power transmission line, wherein the data set expansion comprises a horizontal turning method, a color gamut conversion method, a size scaling method and a mosaic data enhancement method, and a foreign matter image is obtained;

Preferably, the process of tagging the scene picture which accords with the condition that the foreign matter is attached to the power transmission line after the data is cleaned is as follows:

Preferably, the construction process of the improved YOLOv4 network model is as follows:

a YOLOv4 network model is taken as a basic model;

replacing the convolutional layers with a convolution kernel size of 3 × 3 in the three-layer convolutional blocks, the five-layer convolutional blocks, the downsampling layers and the prediction layers of the YOLOv4 network model with depth separable convolutional layers;

applying a k-means clustering algorithm to generate six anchor point frames which are adapted to the small and medium sizes of the self-made data set, and combining three default anchor point frames with large sizes to form the size of the final nine foreign-matter anchor point frames;

and finally obtaining an improved YOLOv4 network model.

An apparatus of the present invention includes a memory and a processor, wherein:

a memory for storing a computer program capable of running on the processor;

and the processor is used for executing the steps of the power transmission line foreign matter detection method based on the improved YOLOv4 when the computer program is run.

A storage medium of the present invention stores a computer program which, when executed by at least one processor, implements the steps of the above power transmission line foreign matter detection method based on improved YOLOv4.

Beneficial effects: compared with the prior art, the invention has the following notable technical effects:

(1) The invention improves the YOLOv4 network model and raises the detection precision for targets, especially small targets; the number of model parameters is reduced to 17% of that of the original model and the detection speed is improved to a certain extent, so the model can be deployed more efficiently on embedded equipment to complete intelligent inspection and obstacle-clearing tasks.

(2) The lightweight GhostNet backbone network adopted by the invention has few parameters yet retains strong feature extraction capability, and meets the requirements of the power transmission line foreign matter detection task.

(3) Compared with the original SPP module, the SPPF module adopted by the invention is more efficient, improves running speed and accelerates model convergence.

(4) The depth separable convolution adopted by the invention can reduce the parameter quantity to one eighth of that of ordinary convolution while the detection precision drops only slightly, which favors industrial-grade application.

(5) The ECA channel attention mechanism introduced by the invention is a lightweight module that improves the feature representation capability of the network and the detection precision, while the added parameter quantity is negligible.

(6) The SiLU activation function adopted by the invention is smoother and performs better than the Leaky ReLU function, with the advantage being more obvious in deep networks.

(7) Adaptive anchor point frames are generated by k-means clustering, and the way anchor point frames are selected for this data set has reference value for selecting anchor point frames for corresponding targets in other target detection fields.

(8) The final model of the invention is also applicable to other target detection fields and can be extended to industrial deployment in other fields.

Drawings

FIG. 1 is a diagram of the YOLOv4 network model architecture;

FIG. 2 is a diagram of an improved YOLOv4 network model architecture;

FIG. 3 is a flow chart of the method of the present invention;

FIG. 4 is the Ghost module in the feature extraction network GhostNet;

FIG. 5 is the Ghost bottleneck structure in the feature extraction network GhostNet;

FIG. 6 is a structural diagram of an SPPF module replacing an original SPP module;

FIG. 7 is a diagram of a general convolution structure;

FIG. 8 is a diagram of a depth separable convolution;

FIG. 9 is a block diagram of the ECA (Efficient Channel Attention) attention mechanism module;

FIG. 10 is a graph of the SiLU (Sigmoid-Weighted Linear Unit) activation function;

FIG. 11 shows part of the experimental samples, in which (a) and (b) are normal samples, (c) and (d) are samples of a kite attached to a power transmission line, (e) and (f) are samples of a balloon attached to a power transmission line, (g) and (h) are samples of trash attached to a power transmission line, and (i) and (j) are samples of a bird nest on a power transmission line;

FIG. 12 is a comparison of the detection effects of different models, in which (a) to (c) are the detection results of the YOLOv4 network, the YOLOv5 network and the improved YOLOv4 network, respectively, in a power transmission line foreign matter scene with a balloon; (d) to (f) are the detection results of the same three networks, respectively, in a scene with a kite; (g) to (i) are the detection results of the same three networks, respectively, in a scene with trash; (j) to (l) are the detection results of the same three networks, respectively, in a scene with a bird nest; and (m) to (o) are the detection results of the same three networks, respectively, in a power transmission line scene from the supplemented small-target data set.

Detailed Description

The technical scheme of the invention is further explained by combining the attached drawings.

A foreign matter detection method for a power transmission line based on improved YOLOv4 comprises data set preparation, data preprocessing, and network training and prediction. The structure of the prior-art YOLOv4 network model is shown in FIG. 1. The structure of the improved YOLOv4 network model of the invention is shown in FIG. 2 and can be divided into three parts: feature extraction by the backbone network, feature fusion by the feature pyramid and the path aggregation network, and prediction by the detection heads. An input picture of size 416 × 416 enters the backbone network (GhostNet) for layer-by-layer feature extraction. When the number of output channels first reaches 40, 112 and 160, i.e. at the 4th, 10th and 12th layers of GhostNet, the three feature maps are output to the feature fusion network; their sizes are (52, 52, 40), (26, 26, 112) and (13, 13, 160), corresponding to the small, medium and large scales respectively. Upper- and lower-layer features are then fused through the feature pyramid (SPPF) structure and the path aggregation network, and the fused features are output to three detection heads of the corresponding sizes, which perform classification regression and coordinate regression respectively and output the target category and the target coordinates in the original image.
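The relationship between the 416 × 416 input and the three feature map sizes can be checked with a few lines of arithmetic. The sketch below assumes downsampling strides of 8, 16 and 32, which are inferred from 416/52, 416/26 and 416/13 rather than stated in the text; the function name is illustrative only.

```python
def feature_map_shapes(input_size, strides, channels):
    """Return (height, width, channels) for each backbone output level."""
    shapes = []
    for stride, ch in zip(strides, channels):
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        side = input_size // stride
        shapes.append((side, side, ch))
    return shapes


# Strides are an assumption inferred from the stated sizes.
shapes = feature_map_shapes(416, strides=(8, 16, 32), channels=(40, 112, 160))
for scale, shape in zip(("small", "medium", "large"), shapes):
    print(scale, shape)
```

The finest grid (52 × 52) serves the small-target head and the coarsest grid (13 × 13) the large-target head.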

As shown in fig. 3, the method for detecting the foreign matter in the power transmission line based on the improved YOLOv4 of the present invention includes the following specific steps:

S1, collecting power transmission line inspection videos with an unmanned aerial vehicle, splitting the collected aerial videos into frames, and performing data cleaning on the scene pictures that show foreign matter attached to the power transmission line;

decomposing the power transmission line inspection video shot by the unmanned aerial vehicle frame by frame, and screening out scene pictures which accord with the condition that foreign matters are attached to the power transmission line;

converting all scene pictures that show foreign matter attached to the power transmission line into jpg or png format, and resizing them to 416 × 416 pixels;

carrying out data enhancement and data set expansion on the scene picture which is subjected to size adjustment and accords with the condition that the power transmission line is attached with the foreign matter, wherein the data set expansion comprises a horizontal turning method, a color gamut conversion method, a size scaling method and a mosaic data enhancement method, and a foreign matter image is obtained;

in order to improve generalization and the detection performance on foreign matter images, the foreign matter images are embedded into power transmission line background images using image-editing (PS) technology, which further enriches the data set and strengthens the training of the model on small targets; the background image must be clearly larger than the foreign matter image.

In this embodiment, the foreign matter image is 250 × 250 pixels and the power transmission line background image is 5472 × 3078 pixels, so the 250 × 250 foreign matter image is embedded into the 5472 × 3078 background image. Specifically, everyday objects such as balloons and kites are distorted to a certain degree and attached at different sizes to different positions on the power transmission line, simulating deformed foreign matter hanging on the line, while the backgrounds are real power transmission line scenes.
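The coordinate bookkeeping behind such an embedding can be sketched as follows. This is an illustration only: it chooses a random paste position and derives the resulting bounding box, and it assumes that the composite is later resized directly to 416 × 416 without letterboxing, which the text does not specify. The function names are hypothetical, and the pixel compositing itself is left out.

```python
import random


def paste_box(bg_size, fg_size, rng):
    """Pick a top-left corner so the foreground lies fully inside the
    background, and return the box as (xmin, ymin, xmax, ymax)."""
    bg_w, bg_h = bg_size
    fg_w, fg_h = fg_size
    if fg_w > bg_w or fg_h > bg_h:
        raise ValueError("foreground must not be larger than the background")
    x = rng.randint(0, bg_w - fg_w)
    y = rng.randint(0, bg_h - fg_h)
    return (x, y, x + fg_w, y + fg_h)


def rescale_box(box, src_size, dst_size):
    """Map a box from the source image size to the resized image size."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
box = paste_box((5472, 3078), (250, 250), rng)
small = rescale_box(box, (5472, 3078), (416, 416))
print("box in background:", box)
print("box width after resize:", round(small[2] - small[0], 1))
```

Under these assumptions the 250-pixel-wide object shrinks to roughly 19 pixels in the network input, which illustrates why embedded samples of this kind act as small-target training data.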

S2, tagging the scene picture which accords with the condition that the power transmission line is attached with the foreign matter after the data is cleaned, constructing a foreign matter detection data set under the power transmission line scene based on the tagged scene picture which accords with the condition that the power transmission line is attached with the foreign matter, and dividing a training set, a verification set and a test set;

labeling each picture with the labelImg labeling tool to form a corresponding xml label file, wherein the file is in PASCAL VOC format and contains the two diagonal corner coordinates of each rectangular target in the picture and its assigned category;
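A PASCAL VOC style label file stores, per object, a class name and a `bndbox` with `xmin`, `ymin`, `xmax` and `ymax`. The following minimal sketch writes and reads back such a structure with the Python standard library; the file name, class and coordinates are hypothetical values chosen for illustration, and only the fields needed here are emitted.

```python
import xml.etree.ElementTree as ET


def voc_annotation(filename, width, height, objects):
    """Build a minimal VOC-style annotation tree.
    objects: list of (class_name, (xmin, ymin, xmax, ymax))."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for name, (xmin, ymin, xmax, ymax) in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(box, tag).text = str(value)
    return root


def read_boxes(root):
    """Return [(class_name, (xmin, ymin, xmax, ymax)), ...] from an annotation tree."""
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.find("name").text, coords))
    return boxes


# Hypothetical sample: one kite in a 416 x 416 frame.
root = voc_annotation("frame_000123.jpg", 416, 416, [("kite", (120, 80, 210, 190))])
xml_text = ET.tostring(root, encoding="unicode")
parsed = read_boxes(ET.fromstring(xml_text))
print(parsed)
```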

based on a constructed foreign matter detection data set under the power transmission line scene, a training set, a verification set and a test set are constructed according to the proportion of 7.
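A reproducible three-way split can be sketched as below. The ratio in the text is cut off after "7", so the `(7, 2, 1)` used in the usage lines is purely a placeholder and should not be read as the value used in the patent; the function name and seed are likewise illustrative.

```python
import random


def split_dataset(items, ratios, seed=0):
    """Shuffle items and split them into train/val/test by integer ratios."""
    if len(ratios) != 3 or min(ratios) <= 0:
        raise ValueError("ratios must be three positive numbers")
    total = sum(ratios)
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Placeholder ratio and a hypothetical list of 1000 sample ids.
train, val, test = split_dataset(range(1000), ratios=(7, 2, 1))
print(len(train), len(val), len(test))
```

Fixing the seed keeps the three subsets identical between runs, so that weights selected on the verification set are always evaluated on the same held-out test pictures.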

S3, constructing an improved YOLOv4 network model;

a YOLOv4 network model is taken as a basic model;

The lightweight GhostNet backbone network adopted by the invention has few parameters yet retains strong feature extraction capability, and meets the requirements of the power transmission line foreign matter detection task. The network comprises a large number of Ghost modules and two Ghost bottleneck structures with strides of 1 and 2, as shown in FIG. 4 and FIG. 5. The Ghost module consists of two operations: the first generates a non-redundant feature map through ordinary convolution, and the second generates the complete feature map through an identity mapping together with a cheap linear operation Φ. The efficiency comes from the low cost of the cheap linear operation. The Ghost bottleneck structures with strides of 1 and 2 each comprise two Ghost modules: the first increases the number of channels and the second reduces it so that it finally matches the number of input channels. An SE attention mechanism module is placed in between to enhance the feature extraction capability. The shortcut branch preserves the input feature layer so that the feature maps of the two branches can be fused to obtain the final output. The specific network configuration is shown in Table 1.

Table 1 GhostNet network configuration table
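The parameter saving of a Ghost module can be estimated with a short calculation. The sketch below counts weights only (no bias or normalization parameters) and assumes the formulation in which a primary convolution produces 1/s of the output channels and a d × d depthwise operation generates the rest; the ratio `s`, the kernel sizes and the channel counts in the usage lines are assumptions for illustration, not values taken from the text.

```python
def conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k=3, d=3, s=2):
    """Weight count of a Ghost module: a primary convolution producing
    c_out / s intrinsic maps plus cheap d x d depthwise operations that
    generate the remaining (s - 1) * c_out / s maps."""
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by the ratio s")
    intrinsic = c_out // s
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (s - 1) * d * d
    return primary + cheap


ordinary = conv_params(40, 112, 3)
ghost = ghost_module_params(40, 112, k=3, d=3, s=2)
print(ordinary, ghost, round(ghost / ordinary, 3))
```

With s = 2 the module needs about half the weights of the ordinary convolution it stands in for, since the cheap operations add only a small depthwise term.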

The spatial pyramid pooling (SPP) module of the original YOLOv4 algorithm is replaced by the faster SPPF module, as shown in FIG. 6. The original SPP module uses max pooling kernels of sizes 5 × 5, 9 × 9 and 13 × 13, whereas the improved SPPF module achieves the same result by stacking different numbers of 5 × 5 max pooling kernels. The pyramid pooling module extracts spatial feature information at different scales from its input, improves the robustness of the model to spatial layout and object deformation, and guarantees a uniform fixed-size output for inputs of different sizes.

The SPPF module sets up a main path that passes the input serially through several 5 × 5 max pooling kernels. The first branch connects directly to the output, equivalent to a 1 × 1 max pooling kernel. The second branch passes through one 5 × 5 max pooling kernel before connecting to the output, equivalent to a 5 × 5 max pooling kernel. The third branch passes through two 5 × 5 pooling kernels, equivalent to a 9 × 9 max pooling kernel. The last branch passes through three 5 × 5 pooling kernels, equivalent to a 13 × 13 max pooling kernel. Finally, the outputs of the four branches are stacked.
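The claimed equivalence between stacked 5 × 5 pooling and single 9 × 9 and 13 × 13 pooling can be verified numerically. The sketch below is a plain-Python check on a single-channel grid, assuming stride 1 and "same" padding in which out-of-range cells are ignored (equivalent to padding with negative infinity); it is a verification aid, not the network implementation, and the function names are illustrative.

```python
import random


def max_pool_same(grid, k):
    """k x k max pooling with stride 1; cells outside the grid are ignored."""
    h, w = len(grid), len(grid[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [grid[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            row.append(max(window))
        out.append(row)
    return out


def spp_branches(grid):
    """Parallel branches of SPP: identity, 5 x 5, 9 x 9 and 13 x 13 pooling."""
    return [grid] + [max_pool_same(grid, k) for k in (5, 9, 13)]


def sppf_branches(grid):
    """Serial branches of SPPF: identity, then one, two and three 5 x 5 poolings."""
    p1 = max_pool_same(grid, 5)
    p2 = max_pool_same(p1, 5)
    p3 = max_pool_same(p2, 5)
    return [grid, p1, p2, p3]


rng = random.Random(0)
grid = [[rng.randint(0, 99) for _ in range(13)] for _ in range(13)]
same = spp_branches(grid) == sppf_branches(grid)
print("SPP and SPPF branches identical:", same)
```

The saving comes from reuse: each SPPF stage pools over a 5 × 5 window of the previous stage's output instead of re-scanning a 9 × 9 or 13 × 13 window of the input.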

The convolutional layers with a 3 × 3 kernel in the three-layer convolutional blocks, the five-layer convolutional blocks, the downsampling layers and the prediction layers of the YOLOv4 network model are replaced by depth separable convolutional layers. Ordinary convolution is shown in FIG. 7 and depth separable convolution in FIG. 8. Depth separable convolution consists of two parts, channel-by-channel (depthwise) convolution and point-by-point (pointwise) convolution. In channel-by-channel convolution each convolution kernel is responsible for exactly one channel and each channel is convolved by exactly one kernel, so the number of feature map channels produced equals the number of input channels. Because each input channel is convolved independently, feature information at the same spatial position in different channels is not used, so the feature maps must be combined by point-by-point convolution to generate new feature maps. For the same input and the same output size, the parameter quantity of depth separable convolution is only 1/3 of that of conventional convolution, so adopting it reduces the number of model parameters and makes the model smaller.

Since most of the model's parameters come from convolutional layers, depth separable convolution, which is the most commonly used technique in lightweight networks and works well, is used to further reduce the model parameters. All 3 × 3 convolutional layers in the three-layer convolutional blocks, the five-layer convolutional blocks, the downsampling layers and the prediction layers, 15 layers in total, are replaced with depth separable convolutional layers.
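How much a single layer saves can be worked out directly from the two-part structure. The sketch below counts weights only, ignoring bias and normalization parameters, and uses a hypothetical 256-channel layer; under this accounting the ratio is 1/c_out + 1/k², so the exact figure depends on the channel counts and on what is included in the count.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights of an ordinary k x k convolution."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights of a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Hypothetical layer width chosen for illustration.
c_in, c_out, k = 256, 256, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std
print(std, dws, round(ratio, 4))
```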

Inserting an ECA (Efficient Channel Attention) module after the up-sampling layers and down-sampling layers of the YOLOv4 network model;

An ECA module is inserted after each of the two up-sampling layers and two down-sampling layers of the network model. After sampling, the attention mechanism lets the network focus on the channels that carry more effective information, filtering out interference from useless information and improving information fusion. Because another branch with fused features is stacked with this one, there is no risk of losing valid information. The ECA module structure is shown in FIG. 9. Its design rests on the observation that dimension reduction negatively affects channel attention prediction, and that capturing dependencies across all channels is inefficient and unnecessary. On this basis, the Efficient Channel Attention (ECA) module for CNNs avoids dimension reduction and realizes cross-channel interaction efficiently. This is achieved by a fast one-dimensional convolution of size k, where the kernel size k is the coverage of local cross-channel interaction, i.e. how many neighboring channels take part in the attention prediction of one channel.
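The computation inside an ECA block can be written out in plain Python: global average pooling per channel, a one-dimensional convolution of size k across the channel axis, a sigmoid, and channel-wise rescaling. The sketch below is an illustration under stated assumptions: zero padding at the channel borders, no bias, and hand-picked kernel weights (in a trained network these k weights are learned).

```python
import math


def eca(feature, kernel):
    """feature: list of C channels, each an H x W list of floats.
    kernel: list of k (odd) weights of the 1-D convolution over channels.
    Returns (rescaled feature, per-channel attention weights)."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel size k must be odd")
    c = len(feature)
    # Global average pooling: one scalar per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]
    pad = k // 2
    attention = []
    for i in range(c):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - pad
            if 0 <= idx < c:  # zero padding at the channel borders
                acc += w * pooled[idx]
        attention.append(1.0 / (1.0 + math.exp(-acc)))
    scaled = [[[v * a for v in row] for row in ch] for ch, a in zip(feature, attention)]
    return scaled, attention


# Three constant 2 x 2 channels and hypothetical kernel weights with k = 3.
feature = [[[1.0, 1.0], [1.0, 1.0]],
           [[2.0, 2.0], [2.0, 2.0]],
           [[3.0, 3.0], [3.0, 3.0]]]
scaled, attention = eca(feature, kernel=[0.5, 1.0, 0.5])
print([round(a, 4) for a in attention])
```

The block adds only k weights regardless of the number of channels, which is consistent with the text's remark that the added parameter quantity is negligible.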

In the path aggregation network and prediction layer part of the YOLOv4 network model, the SiLU (Sigmoid-Weighted Linear Unit) activation function replaces the Leaky ReLU activation function of the original algorithm; the function curve is shown in FIG. 10.

The feature maps are input into the path aggregation network and then enter the prediction layer; the activation function used throughout this process is SiLU, whose formula is:

f(x) = x · sigmoid(x) = x / (1 + e^(−x))
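The difference between the two activation functions is easiest to see on a few sample points. The sketch below implements both in plain Python; the Leaky ReLU negative slope of 0.1 is an assumption, since the text does not state the slope used.

```python
import math


def silu(x):
    """SiLU: x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def leaky_relu(x, negative_slope=0.1):
    """Leaky ReLU; the slope value is an assumption for illustration."""
    return x if x >= 0 else negative_slope * x


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  silu={silu(x):+.4f}  leaky_relu={leaky_relu(x):+.4f}")
```

Unlike Leaky ReLU, which has a kink at zero, SiLU is smooth everywhere and slightly non-monotonic for negative inputs.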

applying a k-means clustering algorithm to generate nine anchor point frames adapted to the self-made data set;

Nine anchor point frames are generated by the k-means clustering algorithm, corresponding to the large, medium and small scales. The comparison experiment in Table 2 analyzes the original anchor point frames against those generated by k-means clustering. The six small- and medium-size anchor point frames generated by k-means clustering are adopted, while the large-size anchor point frames keep the values of the original large-size anchor point frames.

TABLE 2 comparison experiment of initial anchor boxes
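The anchor selection described above can be sketched with a small k-means routine over box widths and heights. Several points are assumptions rather than details from the text: the distance 1 − IoU (a common choice for YOLO anchors), the mean-based centre update, the synthetic box sizes, and the three large anchors, which are the values distributed with the public YOLOv4 configuration.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (w, h) and aligned at the same corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, seed=0, max_iter=100):
    """Cluster (w, h) pairs with distance 1 - IoU; return centres sorted by area."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centers[i]))
            clusters[best].append(box)
        new_centers = []
        for i, members in enumerate(clusters):
            if not members:  # keep the old centre if a cluster runs empty
                new_centers.append(centers[i])
                continue
            mean_w = sum(w for w, _ in members) / len(members)
            mean_h = sum(h for _, h in members) / len(members)
            new_centers.append((mean_w, mean_h))
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda wh: wh[0] * wh[1])


# Synthetic small and medium boxes standing in for the labeled data set.
rng = random.Random(1)
boxes = [(rng.uniform(8, 120), rng.uniform(8, 120)) for _ in range(200)]
small_medium = kmeans_anchors(boxes, k=6)

# Assumed default large anchors (public YOLOv4 configuration values).
default_large = [(142, 110), (192, 243), (459, 401)]
anchors = small_medium + default_large
print([(round(w), round(h)) for w, h in anchors])
```

Keeping the default large anchors means only the small and medium scales are fitted to the data set's box statistics.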

Finally, an improved YOLOv4 network model is obtained, and the structure of the improved YOLOv4 network model is shown in FIG. 2.

S4, training the improved YOLOv4 network model on the training set, using the verification set for model selection and parameter tuning, and storing the final trained weights;

S5, detecting the test set pictures with the stored weights to obtain the detection results for the power transmission line foreign matter images.

the data acquisition module is used for acquiring the power transmission line inspection video;

a memory for storing a computer program operable on the processor;

and the processor is used for executing the steps of the power transmission line foreign matter detection method based on the improved YOLOv4 when the computer program is run, and can achieve the technical effects consistent with the method.

The storage medium of the present invention stores thereon a computer program, which when executed by at least one processor, implements the steps of the above-mentioned method for detecting foreign matters in an electric transmission line based on improved YOLOv4, and can achieve technical effects consistent with the above-mentioned method.

Example:

The foreign matter detection data set for the power transmission line scene in this embodiment contains 4496 samples divided into four categories: nest 2541, balloon 704, kite 691 and trash 643. The training set, the verification set and the test set are divided according to the proportion of a data set 7.

In this embodiment, the building, training and testing of the model are all completed under the PyTorch framework, with the CUDA parallel computing architecture and the cuDNN acceleration library integrated to accelerate computation.

(1) Performance evaluation indexes of the prediction results;

AP (Average Precision) is the area under the P-R curve of a given class, that is, the area enclosed by the curve whose coordinate axes are the precision and the recall. mAP (Mean Average Precision) is the mean of the areas under the P-R curves of all classes, i.e. the mean of the APs of all classes, and is calculated as follows, where C is the total number of classes:

$$\mathrm{mAP} = \frac{1}{C}\sum_{i=1}^{C} \mathrm{AP}_i$$

where AP_i is the AP value of the i-th class of target, so that mAP is the sum of the APs of all classes divided by the total number of classes. FPS (Frames Per Second) is used to evaluate the speed of target detection, i.e. the number of pictures that can be processed per second, or equivalently the time required to process one picture; the shorter the time, the faster the speed.
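The evaluation indexes described above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code of the embodiment: the all-point interpolation used for the area under the P-R curve is an assumption (the text does not state how the area is integrated), and the function names are hypothetical.

```python
def average_precision(recalls, precisions):
    """Area under one class's P-R curve (all-point interpolation, assumed).

    `recalls` must be sorted in ascending order, with `precisions` aligned.
    """
    # Add sentinel points at recall 0 and recall 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing when read from left to right.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """mAP: sum of the per-class APs divided by the number of classes C."""
    return sum(ap_per_class.values()) / len(ap_per_class)


def frames_per_second(num_images, elapsed_seconds):
    """FPS: number of pictures processed per second."""
    return num_images / elapsed_seconds
```

For instance, a detector that processes 524 pictures in 10 seconds has an FPS of 52.4, i.e. roughly 19 ms per picture.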

(2) Ablation experiment of the improved model;

In training the improved YOLOv4 network model, the smaller the value of the loss function, the better, and the ideal value is 0. As the network is improved in various ways, the performance gain of the model is reflected by evaluation indexes such as mAP, FPS, parameter count and model size. The input picture is resized to 416 × 416, the batch size is 8, and the Adam optimizer is used. A cosine annealing decay schedule adjusts the current learning rate in stages, with the initial learning rate set to 0.001, the minimum learning rate set to 0.00001 and the cosine period set to 5. The label smoothing value is set to 0.005 and the weight decay is set to 0.0005. Training runs for 100 epochs. Every 10 epochs the verification set pictures are input into the network and the current performance of the model is evaluated, giving the current detection accuracy (mAP) of the model and the AP value of each class of target.
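The learning rate schedule and label smoothing settings listed above can be illustrated as follows. This is a sketch under assumptions: the closed form is the standard cosine annealing formula (the same one documented for PyTorch's `CosineAnnealingLR`), the "cosine period" of 5 is assumed to correspond to its `t_max` parameter, and the label smoothing formula is the one commonly used in YOLO implementations; the text does not confirm either choice.

```python
import math


def cosine_annealing_lr(epoch, lr_max=1e-3, lr_min=1e-5, t_max=5):
    """Learning rate at a given epoch for cosine annealing.

    Decays from lr_max to lr_min over t_max epochs, then rises back,
    so the full cycle length is 2 * t_max (assumed interpretation).
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / t_max))


def smooth_labels(one_hot, smoothing=0.005):
    """Label smoothing: move a small probability mass to all classes."""
    num_classes = len(one_hot)
    return [y * (1.0 - smoothing) + smoothing / num_classes for y in one_hot]


# Learning rates over the first cycle of the 100-epoch run.
schedule = [cosine_annealing_lr(e) for e in range(11)]
```

With four classes and a smoothing value of 0.005, the target for the true class becomes 0.99625 and each other class receives 0.00125.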

For the original YOLOv4 model, this embodiment first tunes its training tricks to their best setting by parameter adjustment, then takes the tuned original model as the baseline and adds different changes in the expectation of better performance. The final results are shown in Table 3. The currently mainstream target detection model YOLOv4 is selected as the initial model (A). Replacing the original backbone network with the lightweight backbone network GhostNet gives model (B), whose mAP is 97.07%, 1.37% higher than that of the original YOLOv4 model (A); this indicates that, in a relatively simple detection task, a lightweight network can converge faster and fit more easily than a deeper, more complex network. At the same time, the parameter count is far smaller than that of the original YOLOv4 model, the inference speed is improved to a certain extent, and the memory occupation is reduced to 61.5% of that of the original model. Next, the faster feature pyramid pooling module SPPF replaces the original SPP module, which increases the running speed and the convergence speed of the model. With this module, the inference speed of model (C) is further improved and its mAP rises slightly, while the parameter count and memory occupation are unchanged. Then the depthwise separable convolution, the most commonly used and effective convolution in lightweight networks, is adopted to further reduce the parameter count. The parameter count of model (D) is reduced to one fourth of that of model (C) and the detection speed is slightly improved, but the mAP is reduced by 1.17%.
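The parameter saving from a depthwise separable convolution can be checked with simple arithmetic. The sketch below counts the weights of one layer only, ignoring biases and batch normalization; the channel sizes are hypothetical and are not taken from the network of the embodiment, so the layer-level ratio shown here is not the model-level reduction reported in Table 3.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 256 input channels, 512 output channels, 3 x 3 kernel.
std = standard_conv_params(256, 512, 3)          # 1,179,648 weights
dws = depthwise_separable_params(256, 512, 3)    # 133,376 weights
ratio = dws / std                                # equals 1/c_out + 1/k**2
```

For this layer the ratio is about 0.113, which matches the general expression 1/c_out + 1/k².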
An ECA module is inserted after each of the two up-sampling layers and the two down-sampling layers of the improved YOLOv4, so that after sampling the attention mechanism focuses on the channels carrying more effective information; part of the useless information is thereby filtered out and the information fusion effect is better. Because another branch carrying fused features is stacked with this output, there is no risk of losing effective information. In addition, the Leaky ReLU activation function is replaced with the SiLU activation function. The detection speed of model (E) decreases slightly, but the mAP increases by 0.74%. For target detection in this specific background environment, a k-means clustering algorithm is adopted. After comprehensively considering the sample sizes in the data set, and in order to enhance the detection capability of the network for small targets, the small and medium-size anchor boxes generated by clustering are retained, while the large sizes still use the original anchor boxes adopted for the COCO data set. Finally, the mAP of network (F) increases by 0.39% without any performance loss.
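The anchor box strategy described above can be sketched as follows. Several points are assumptions rather than details given in the text: the distance measure (1 − IoU on width and height, the usual choice for YOLO anchors), the deterministic initialization, and the three large default anchors, which are taken here from the publicly distributed YOLOv4 configuration as placeholders. All names are hypothetical.

```python
def wh_iou(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, max_iters=100):
    """Cluster (width, height) pairs into k anchors using IoU similarity."""
    boxes = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(boxes) / k
    # Deterministic start: evenly spaced boxes in area order (assumed).
    centroids = [boxes[int(i * step + step / 2)] for i in range(k)]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda j: wh_iou(b, centroids[j]))
            clusters[best].append(b)
        # New centroid = mean width and height; keep the old one if empty.
        updated = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda a: a[0] * a[1])


# Placeholder large anchors (assumed, from the public YOLOv4 configuration).
DEFAULT_LARGE_ANCHORS = [(142, 110), (192, 243), (459, 401)]


def build_anchor_set(boxes):
    """Six clustered small/medium anchors plus three default large anchors."""
    return kmeans_anchors(boxes, k=6) + DEFAULT_LARGE_ANCHORS
```

In practice `boxes` would hold the labeled width and height of every foreign matter target, rescaled to the 416 × 416 network input.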

TABLE 3 ablation experiments of the improved YOLOv4

(3) Comparing the improved model with different models;

In order to verify the effectiveness of the method, comparison experiments are carried out against other mainstream algorithms in the same environment and on the same data set: YOLOv4-tiny, YOLOv3, SSD, Faster R-CNN and YOLOv5. The SSD and YOLO series algorithms are one-stage networks, and Faster R-CNN is a two-stage network. The specific experimental results are shown in Table 4. The results in Table 4 show that the method effectively improves the detection precision for the four kinds of foreign matter on the power transmission line. The method achieves the highest average precision over the four classes of foreign matter: its mAP at IoU thresholds of 0.5 and 0.75 is 97.30% and 64.56% respectively, both the highest values. The FPS reaches 52.4, which meets the requirement of real-time detection. Meanwhile, the parameter count is only 17% of that of the original YOLOv4 model, which makes the method better suited to deployment on embedded equipment with limited memory.
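The role of the IoU threshold in the two mAP figures can be illustrated with a small sketch. The box format (x_min, y_min, x_max, y_max) and the function names are assumptions made for illustration; the matching rule shown is the usual one in which a prediction counts as correct only if its IoU with the ground-truth box reaches the threshold.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, iou_threshold):
    """A prediction is counted as correct when IoU >= threshold."""
    return box_iou(pred_box, gt_box) >= iou_threshold


# A prediction shifted by 2 pixels against a 10 x 10 ground-truth box.
pred, gt = (2, 0, 12, 10), (0, 0, 10, 10)
overlap = box_iou(pred, gt)   # 80 / 120, about 0.667
```

This prediction is accepted at a threshold of 0.5 but rejected at 0.75, which is why the mAP at 0.75 is the stricter measure of localization quality.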

TABLE 4 comparison of test results of different algorithms

(4) Comparing detection effects;

In order to compare the actual detection effect of the algorithm, it is compared with the two models with the highest precision, namely YOLOv4 and YOLOv5. The detection effect is shown in FIG. 12. The first to fourth rows are power transmission line foreign matter scenes containing balloons, kites, garbage and bird nests respectively: diagrams (a) to (c) show balloons, diagrams (d) to (f) show kites, diagrams (g) to (i) show garbage, and diagrams (j) to (l) show bird nests. The fifth row, diagrams (m) to (o), is a power transmission line scene from the supplementary small target data set. The left column, diagrams (a), (d), (g), (j) and (m), shows the detection effect of the YOLOv4 network; the middle column, diagrams (b), (e), (h), (k) and (n), shows the detection effect of the YOLOv5 network; and the right column, diagrams (c), (f), (i), (l) and (o), shows the detection effect of the improved YOLOv4 network.

In the first row, YOLOv4 fails to detect the balloon suspended on the power transmission line, while both YOLOv5 and the improved YOLOv4 network proposed by the invention detect it, the improved YOLOv4 network with higher confidence. The detection of the kite in the second row is similar to that of the balloon in the first row, and the improved YOLOv4 network again obtains the best detection result. In the third row, for the garbage hung on the power transmission line, YOLOv4 detects nothing; YOLOv5 produces a false detection, regarding the power transmission line as garbage, and its detection boxes for the other two pieces of garbage do not enclose them fully; the improved YOLOv4 accurately frames both pieces of garbage with extremely high confidence. The fourth row is a scene of a bird nest on a tower: YOLOv5 fails to detect it, while YOLOv4 and the improved YOLOv4 both detect it with high confidence. The improved YOLOv4 frames the whole bird nest more accurately, whereas the YOLOv4 box is too narrow and does not actually contain the whole bird nest.

The last row comes from a data set supplemented specifically to improve the detection capability for small targets: a foreign matter image of 250 × 250 pixels is embedded into a background image of 5472 × 3078 pixels, with a certain amount of distortion and flipping applied, so that the target appears in the image at an extremely small size, which greatly improves the detection capability of the network for small targets during power transmission line inspection. The final results likewise show that the improved YOLOv4 achieves the highest confidence. In conclusion, compared with mainstream high-performance detectors, the improved YOLOv4 network model generally obtains the best performance in detecting small and medium targets, has an obvious advantage in parameter count, detects fast enough to meet the requirements of actual industrial scenes, and can be applied efficiently to foreign matter detection in power transmission line scenes.
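The construction of the supplementary small target samples can be sketched as below. Only the two image sizes come from the text; the uniform random placement, the aspect-preserving resize to the 416 × 416 network input, and the function names are assumptions made for illustration.

```python
import random


def embed_position(bg_w=5472, bg_h=3078, fg_w=250, fg_h=250, rng=None):
    """Pick a paste position and return the resulting label box.

    The box is (x_min, y_min, x_max, y_max) in background pixel coordinates,
    i.e. the two diagonal corners of the embedded foreign matter image.
    """
    rng = rng or random.Random()
    x_min = rng.randint(0, bg_w - fg_w)   # keep the patch fully inside
    y_min = rng.randint(0, bg_h - fg_h)
    return (x_min, y_min, x_min + fg_w, y_min + fg_h)


def size_at_network_input(fg_w, fg_h, bg_w=5472, bg_h=3078, input_size=416):
    """Target size after an aspect-preserving resize of the background (assumed)."""
    scale = min(input_size / bg_w, input_size / bg_h)
    return fg_w * scale, fg_h * scale


label_box = embed_position(rng=random.Random(0))
small_w, small_h = size_at_network_input(250, 250)
```

Under these assumptions a 250 × 250 patch shrinks to roughly 19 × 19 pixels at the network input, which illustrates why such samples exercise the small anchor boxes.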

Claims

1. A foreign matter detection method for a power transmission line based on improved YOLOv4 is characterized by comprising the following steps:

collecting a power transmission line patrol video, performing framing processing on the power transmission line patrol video, and performing data cleaning on a scene picture which is consistent with the condition that the power transmission line is attached with foreign matters;

constructing an improved YOLOv4 network model;

2. The method for detecting the foreign matter in the power transmission line based on the improved YOLOv4 as claimed in claim 1, wherein the data cleaning of the scene pictures conforming to the condition that the foreign matter is attached to the power transmission line comprises the following steps:

all scene pictures which accord with the condition that the power transmission line is attached with foreign matters are converted into a jpg or png format;

carrying out data enhancement and data set expansion on the scene picture which is subjected to format conversion and accords with the condition that the power transmission line is attached with the foreign matter, wherein the data set expansion comprises a horizontal flipping method, a color gamut conversion method, a size scaling method and a mosaic data enhancement method, and a foreign matter image is obtained;

and embedding the foreign matter image into the background image of the power transmission line, wherein the foreign matter image is smaller than the background image of the power transmission line.

3. The method for detecting the foreign matter in the power transmission line based on the improved YOLOv4 as claimed in claim 1, wherein the method for labeling the scene picture which is matched with the foreign matter attached to the power transmission line after the data cleaning comprises:

labeling each picture by using the LabelImg labeling tool to form a corresponding xml label file, wherein the format of the xml label file is PASCAL VOC and the xml label file comprises two diagonal coordinates of a rectangular target in the picture and a given category.

4. The method for detecting the foreign matter in the power transmission line based on the improved YOLOv4 as claimed in claim 1, wherein the method for constructing the improved YOLOv4 network model comprises the following steps:

a YOLOv4 network model is taken as a basic model;

improving a feature pyramid pooling SPP module of the YOLOv4 network model into an SPPF module;

applying a path aggregation network and a prediction layer part of a YOLOv4 original algorithm, and replacing a Leaky ReLU activation function in the common convolution with a SiLU activation function;

generating six anchor point frames with small and medium sizes adapting to a foreign body detection data set in a power transmission line scene by using a k-means clustering algorithm, and combining three default anchor point frames with large sizes to form nine final foreign body anchor point frames;

and finally obtaining an improved YOLOv4 network model.

5. A foreign matter detection system of power transmission line based on improved YOLOv4 is characterized by comprising:

and the test module is used for detecting the test set picture by using the stored weight to obtain a detection result of the foreign matter image of the power transmission line.

6. The power transmission line foreign matter detection system based on improved YOLOv4 as claimed in claim 5, wherein the process of data cleaning the scene picture corresponding to the power transmission line with the foreign matter is as follows:

7. The system for detecting foreign matters on power transmission lines based on improved YOLOv4 as claimed in claim 5, wherein the process of labeling the scene pictures which conform to the condition that foreign matters are attached to the power transmission lines after data cleaning comprises:

8. The system for detecting foreign matters in power transmission lines based on improved YOLOv4 as claimed in claim 5, wherein the construction process of the improved YOLOv4 network model is as follows:

a YOLOv4 network model is taken as a basic model;

and finally obtaining an improved YOLOv4 network model.

9. An apparatus, comprising a memory and a processor, wherein:

a memory for storing a computer program capable of running on the processor;

a processor for executing the steps of the method for detecting foreign matters on a power transmission line based on improved YOLOv4 as claimed in any one of claims 1 to 4 when the computer program runs.

10. A storage medium, wherein the storage medium has a computer program stored thereon, and the computer program, when executed by at least one processor, implements the steps of the method for detecting foreign matters on a power transmission line based on improved YOLOv4 as claimed in any one of claims 1 to 4.