CN113378672A - Multi-target detection method for defects of power transmission line based on improved YOLOv3 - Google Patents
- Publication number
- CN113378672A (CN202110600438.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- data set
- target
- target data
- fusion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/40—Image enhancement or restoration using histogram techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a multi-target detection method for power transmission line defects based on improved YOLOv3, comprising the following steps: step one, screening the data set images, i.e. screening the original images and selecting target images that meet the requirements; step two, performing image augmentation on the images obtained in step one to obtain a target data set; step three, after data augmentation is complete, preprocessing some of the photos in the target data set using piecewise linear gray-level transformation, histogram equalization, homomorphic filtering and smoothing-based denoising; step four, sorting and labeling the target data set preprocessed in step three to obtain a labeled target data set; step five, improving YOLOv3 by combining a feature attention mechanism with feature fusion; and step six, training the improved algorithm on the target data set and using it to detect pictures.
Description
Technical Field
The invention relates to the technical field of target detection and identification, and in particular to a multi-target detection method for power transmission line defects based on improved YOLOv3.
Background
Transmission lines are divided into overhead transmission lines and cable lines. An overhead transmission line consists of towers, conductors, line fittings, insulators, guy wires, grounding devices and the like, and is widely distributed across varied terrain such as fields, urban areas, deserts and lakes. Because such lines operate outdoors for long periods and are exposed to extreme weather such as strong winds, rainstorms and intense sunlight, components such as conductors, fittings and insulators are prone to defects such as corrosion, damage and broken strands. In addition, non-standard installation of components creates hidden dangers for the safe operation of the transmission line.
With the development and application of unmanned aerial vehicle (UAV) based defect identification schemes for transmission lines, the volume of picture data acquired through UAV inspection is growing exponentially, and the drawbacks of traditional manual inspection are becoming increasingly apparent. Using computers to perform intelligent defect identification on inspection pictures in turn further raises the requirements on the professional competence of personnel. At present, UAVs account for a growing share of power inspection work, and UAV inspection is becoming increasingly intelligent and automated; the future direction of power inspection should be full coverage of inspection scenarios by UAVs.
Power computer vision (Power CV) is a sub-field of power artificial intelligence. It solves the visual problems in each link of a power system by combining technologies such as machine learning, pattern recognition and digital image processing with domain knowledge of the power profession, and it covers every link of 'transmission and transformation' across the whole power system. Camera monitoring equipment installed on the lines and UAV inspection work capture the lines under inspection and produce large volumes of video and images, which can only be analyzed and processed well in combination with relevant power system knowledge. In the automatic identification of defects in massive image sets, images of transmission lines have pronounced multi-scale structural characteristics. On the one hand, images taken by UAVs at close range have complex backgrounds, and lighting effects can cause a high rate of misjudgment; on the other hand, when a UAV shoots from different angles, a large amount of occlusion can occur, and separating local contour structures is a difficult task.
Helicopter inspection methods initially applied a super-red method to images taken manually from helicopters in the air, identifying rusted parts using least-squares fitting and geometric features. However, this approach has limited recognition accuracy and slow detection speed. Later, real-time infrared video sequences were captured by a thermal infrared imager carried on a helicopter, and defective areas in the images were located using methods such as the Hough transform, the Otsu adaptive threshold algorithm and SIFT feature matching. With the continuing advance of science and technology, semi-manual helicopter inspection can no longer meet the development needs of smart grids.
In recent years, driven by a new generation of artificial intelligence technology represented by deep learning, defect identification algorithms for inspection images have been continuously innovated and are gradually being applied in intelligent UAV inspection projects for overhead transmission lines. As CNN-based object detection algorithms such as RCNN, Faster-RCNN and YOLO mature and hardware computing power improves, these algorithms also show unique advantages in the field of power computer vision. An improved Faster-RCNN algorithm has been proposed in which a self-built equipment sample library is used for model training and target detection is performed on power inspection images; it improves the detection precision and speed of the model, but its identification accuracy for small targets is low and real-time performance cannot be guaranteed.
By comparison, one-stage target detection algorithms, represented by the YOLO algorithm based on convolutional neural networks (CNN) in deep learning, achieve a detection speed markedly higher than that of two-stage, ROI (region of interest) based target detection algorithms such as Faster-RCNN while maintaining high identification accuracy. They can therefore meet the real-time requirements of the system and are better suited to industrial field applications.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a multi-target detection method for power transmission line defects based on improved YOLOv3. The original FPN feature fusion mode is improved by adopting an attention mechanism-fusion (Attention-Fusion) mode, and neural network training techniques such as a cosine learning rate and synchronous normalization are used during training. Without changing the neural network architecture and without introducing extra inference computation cost, the performance of YOLOv3 is markedly improved. The improved algorithm is then trained on a self-made data set to achieve multi-target identification of power transmission line defects.
The purpose of the invention is achieved as follows: a multi-target detection method for power transmission line defects based on improved YOLOv3 comprises the following steps:
step one: screening the data set images, i.e. purposefully screening the acquired original images so that each image contains at least one of six classes (ground wires, vibration dampers, bird nests, signboards, ground wire clamps and spacers), and making an initial selection of target images that meet the requirements;
step two: performing image augmentation on the images obtained in step one, i.e. processing the screened images by data augmentation, which includes translation, rotation, flipping, scaling and cropping, and combined rotation-translation transformation, with the translation distance and rotation angle selected randomly during processing, to obtain a target data set;
step three: after data augmentation is complete, preprocessing some of the photos in the target data set using piecewise linear gray-level transformation, histogram equalization, homomorphic filtering and smoothing-based denoising;
step four: sorting and labeling the target data set preprocessed in step three, renaming the pictures in batches and labeling the target data set in batches;
step five: improving YOLOv3 by combining the feature attention mechanism with feature fusion to obtain an improved algorithm;
step six: training the improved algorithm on the labeled target data set and completing detection of the pictures to be detected.
Preferably, the algorithm improvement in step five specifically includes:
attention mechanism-fusion: for any given transformation, the input feature maps X1 and X2 are each passed through a 1 × 1 convolution, yielding T1 and T2, where the feature maps lie in a space of dimension H × W × C, H denotes the height of the feature map, W denotes its width, and C1 and C2 denote the channel numbers;
the T1 and T2 features are passed to a maximum average pooling operation, which compresses the features along the H × W spatial dimensions; the features then become, in a certain sense, vectors with a global receptive field, and the output dimension matches the number of input feature channels, as in the following two formulas:
then a fully connected layer operation is performed, in which the conventional fully connected operation is replaced by a convolution with a kernel size of 1 × 1 and a stride of 1, yielding S1 and S2;
S1 and S2 are added to obtain P, which re-aggregates the original features in the channel dimension, as shown in the formula:
P = S1 + S2
P is passed through a Sigmoid function, and the output weights are regarded as the importance of each fused feature channel; the weights are then applied channel by channel to the X1 and X2 features through matrix operation, so that the original features are recalibrated and fused in the channel dimension to obtain a new feature Y; the process is as follows:
Y = (X1 + X2) * Sigmoid(P).
preferably, in step six, a cosine learning rate and a synchronous normalization technique are used during training;
the cosine learning rate processing specifically comprises:
when a gradient descent algorithm is used to optimize the objective function, a cosine function is used to reduce the learning rate accordingly, and the learning rate varies with the number of iterations as shown in the following two formulas:
where η_min and η_max define the range of the learning rate, T_cur denotes the number of epochs executed so far, and T_max denotes the total number of epochs; the following modification is made in the training process:
in actual training, it suffices to reset the optimizer's TotalIoperation and re-initialize T_cur when the corresponding epoch is reached.
Compared with the prior art, the invention has the following advantages:
1. Image augmentation increases the number of samples in the data set, which reduces network overfitting and improves the robustness and detection precision of the detection algorithm;
2. Image preprocessing is applied to some of the sample pictures: piecewise linear gray-level transformation highlights the gray-level interval of the region of interest in the picture; histogram equalization addresses overexposure or underexposure; homomorphic filtering removes uneven illumination; and smoothing-based denoising removes image noise caused by external factors. Combined, the four methods make the images clearer and their features more distinct, which increases the usefulness of the images; the features of small targets stand out more clearly against the background, which improves small-target detection precision;
3. A cosine learning rate and synchronous normalization are used as neural network training techniques. Reducing the learning rate with a cosine function brings training closer to the global minimum of the loss, and synchronous normalization solves the failure of the BN layer during multi-GPU training. After these improvements, the feature extraction capability of the network is markedly improved, the detection results are enhanced, and the network training time is reduced.
4. The Attention-Fusion mode replaces the Concat fusion mode to improve the original feature fusion. Drawing on the idea of the attention mechanism, it establishes relations among feature channels, which further improves the non-linear capability of the network, highlights key information, suppresses irrelevant information and reduces information redundancy. The feature expression capability of the fused feature map is thereby further enhanced, and the problem caused by sample overlapping is addressed.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description are only embodiments of the present invention, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 shows the development environment configuration of the present invention.
Fig. 2 shows part of the network parameter settings of the present invention.
FIG. 3 is a flow chart of the multi-target detection method for power transmission line defects.
FIG. 4 is an exemplary graph of a data set.
Fig. 5 is a diagram illustrating an example of the manner in which image data is augmented.
Fig. 6 is a diagram of the result of piecewise linear transformation in image pre-processing.
Fig. 7 is a diagram showing the result of histogram equalization in image preprocessing.
Fig. 8 is a diagram showing the result of the smoothing processing in the image preprocessing.
Fig. 9 is a diagram of an example of a data set category.
FIG. 10 is a LabelImg operating interface diagram.
FIG. 11 is a diagram of data set annotation results.
Fig. 12 is a structural diagram of the improved YOLOv3.
FIG. 13 is a diagram of an attention mechanism fusion architecture for algorithm improvement.
Fig. 14 is an identification case of different algorithms.
FIG. 15 shows the detection results of multiple types of defect targets in the power transmission line.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The configuration of the deep learning training platform used by the invention is shown in fig. 1, and part of the network parameter settings are shown in fig. 2.
FIG. 3 is a flow chart of the multi-target detection method for power transmission line defects, which comprises the following steps:
Step one: data set image screening
The power transmission line data set studied comprises six classes of samples: spacers, vibration dampers, bird nests, signboards, ground wires and ground wire clamps. These six classes of components are important parts of the power transmission line and an extremely important link in power inspection. All images are screened, and the pictures containing these six classes of components are selected as the data set, as shown in fig. 4.
Step two: image augmentation
Using the MATLAB software platform, the sample pictures are processed in five ways: translation, rotation, flipping, scaling and cropping, and combined rotation-translation transformation. The translation distance and rotation angle are selected randomly during processing, which increases sample diversity and yields a large sample data set. The image data augmentation scheme is shown in fig. 5.
In actual operation, the data augmentation methods used take the image center as the center point by default. From a mathematical point of view, rotation about this point can be divided into the following steps:
1. moving the rotation point to the origin;
2. rotating around the origin;
3. and then the rotation point is moved back to the original position.
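The three steps above can be sketched with homogeneous 3 × 3 matrices. This is only an illustrative sketch in Python with NumPy (the patent itself uses MATLAB); the function names `translation`, `rotation` and `rotation_about`, the 640 × 480 example center, and the counterclockwise sign convention are assumptions made here, not details taken from the patent.

```python
import numpy as np

def translation(dx, dy):
    """Homogeneous translation by (dx, dy)."""
    return np.array([[1.0, 0.0, dx],
                     [0.0, 1.0, dy],
                     [0.0, 0.0, 1.0]])

def rotation(theta_deg):
    """Homogeneous rotation about the origin; angle given in degrees.
    Counterclockwise in a standard x-right, y-up frame (assumed convention)."""
    t = np.deg2rad(theta_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def rotation_about(center, theta_deg):
    """1. move the rotation point to the origin, 2. rotate, 3. move it back."""
    cx, cy = center
    return translation(cx, cy) @ rotation(theta_deg) @ translation(-cx, -cy)

# Rotate by 90 degrees about the (assumed) image center (320, 240).
H = rotation_about((320.0, 240.0), 90.0)
center_after = H @ np.array([320.0, 240.0, 1.0])  # the center stays fixed
point_after = H @ np.array([330.0, 240.0, 1.0])   # a point 10 px right of center
```

The matrices are applied right to left, so the rightmost factor (the move to the origin) acts first.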
Assume the original coordinates of an image point are [x0, y0, 1]^T and the coordinates after translation are [x, y, 1]^T; the coordinate relationship before and after translation is then:
x = x0 + dx, y = y0 + dy
Image translation shifts all pixels by the same amount in the x and y directions, and the corresponding matrix is:
H_shift = [1 0 dx; 0 1 dy; 0 0 1]
where dx and dy denote the distances moved in the horizontal and vertical directions, respectively.
The image rotation is mainly to rotate by any angle through a specified rotation center point (default is the image center point), and the mathematical matrix is expressed as:
where θ is the angle of rotation (in degrees rather than radians).
Image flipping includes horizontal flipping and vertical flipping. The matrix for horizontal flipping is:
The matrix for vertical flipping is:
In deep learning tasks, a common way to crop an image is to scale the original image by a certain factor (1.1 times in the present system) and then crop the scaled image. The scaling matrix is:
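As a hedged illustration of flipping and scale-then-crop, the sketch below writes both as homogeneous matrices acting on pixel coordinates. The pixel convention (x from 0 to width - 1), the centered crop window, and the names `hflip`, `vflip`, `scale` and `center_crop_box` are assumptions for this example; the patent's own matrices may use a different convention.

```python
import numpy as np

def hflip(width):
    """Horizontal flip: x' = (width - 1) - x, y' = y (assumed pixel convention)."""
    return np.array([[-1.0, 0.0, width - 1.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

def vflip(height):
    """Vertical flip: x' = x, y' = (height - 1) - y."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, -1.0, height - 1.0],
                     [0.0, 0.0, 1.0]])

def scale(sx, sy):
    """Scaling by sx horizontally and sy vertically."""
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

def center_crop_box(width, height, factor=1.1):
    """Crop window (left, top, right, bottom) of the original size, centered in
    the image obtained after scaling by `factor`. Centering is an assumption."""
    scaled_w, scaled_h = round(width * factor), round(height * factor)
    left, top = (scaled_w - width) // 2, (scaled_h - height) // 2
    return left, top, left + width, top + height

top_left = np.array([0.0, 0.0, 1.0])
flipped = hflip(640) @ top_left            # moves to the right edge of the row
scaled = scale(1.1, 1.1) @ np.array([100.0, 200.0, 1.0])
box = center_crop_box(640, 480)            # window inside the 704 x 528 scaled image
```

Applying either flip twice returns the identity, which is a quick sanity check on the matrices.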
In deep learning tasks, data augmentation generally combines several transformations, and it follows from matrix multiplication that different combination orders give different results. To explain this more intuitively, let the translation transformation matrix be H_shift and the rotation transformation matrix be H_rotate. The method mainly uses combined translation-rotation data augmentation, for which there are two different combined transformations.
First, if translation is performed before rotation, the resulting transformation matrix can be expressed as:
Second, if rotation is performed before translation, the resulting transformation matrix can be expressed as:
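The order dependence can be checked numerically. The following minimal sketch assumes column vectors, so the transformation applied first is the rightmost factor; the 10-pixel shift and the 90-degree angle are arbitrary example values.

```python
import numpy as np

# Translation by 10 px along x (example value).
H_shift = np.array([[1.0, 0.0, 10.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])

# Rotation by 90 degrees about the origin (example value).
t = np.deg2rad(90.0)
H_rotate = np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t), np.cos(t), 0.0],
                     [0.0, 0.0, 1.0]])

p = np.array([1.0, 0.0, 1.0])

# Translate first, then rotate: the rotation is the left factor.
shift_then_rotate = H_rotate @ H_shift @ p
# Rotate first, then translate: the translation is the left factor.
rotate_then_shift = H_shift @ H_rotate @ p
```

With these values the first composition sends the point (1, 0) to (0, 11) and the second sends it to (10, 1), so the two augmentations produce different samples from the same random parameters.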
Step three: image preprocessing
1) Gray-level transformation using piecewise linear transformation
The processing result is shown in fig. 6: the method highlights the gray-level intervals of the regions of interest and relatively suppresses the gray-level intervals that are not of interest.
The mathematical expression of the piecewise linear transformation is:
where the gray-level interval [a, b] is linearly stretched, and the gray-level intervals [0, a] and [b, f_max] are compressed.
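A three-segment mapping of this kind can be sketched with `numpy.interp`, which interpolates linearly between breakpoints. The output breakpoints `c` and `d` (the gray levels that `a` and `b` map to) and the output maximum `g_max` are hypothetical parameters introduced for this example; the text above only names `a`, `b` and `f_max`.

```python
import numpy as np

def piecewise_linear(f, a, b, c, d, f_max=255.0, g_max=255.0):
    """Map [0, a] -> [0, c], [a, b] -> [c, d], [b, f_max] -> [d, g_max].

    The interval [a, b] is stretched when (d - c) > (b - a); the two outer
    intervals are then compressed. Requires 0 < a < b < f_max.
    """
    f = np.asarray(f, dtype=np.float64)
    return np.interp(f, [0.0, a, b, f_max], [0.0, c, d, g_max])

# Stretch gray levels 100..150 onto 50..200 (slope 3); compress the rest.
levels = np.array([0, 50, 100, 125, 150, 255])
stretched = piecewise_linear(levels, a=100, b=150, c=50, d=200)
```

Because the mapping is monotonic, the ordering of gray levels is preserved while contrast inside [a, b] increases.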
2) Histogram equalization
As shown in fig. 7, histogram equalization makes the gray levels of the image uniformly distributed, or spreads them further apart, which increases contrast and makes the picture clearer.
Assume the gray level of the original image at (x, y) is f, with value range [0, L-1], where f = 0 is black and f = L-1 is white, and the gray level after equalization is j. The transformation can then be described as:
j(x,y)=T[f(x,y)],0≤f≤L-1
where the transformation T must satisfy two conditions: T(r) is strictly increasing on the gray-level interval [0, L-1]; and 0 ≤ T(r) ≤ L-1 for 0 ≤ r ≤ L-1, where r denotes the input gray level f and L ≤ 256.
The cumulative distribution function (CDF) satisfies exactly these two conditions, and its mathematical expression is:
s = T(r) = (L-1) ∫_0^r p_r(ω) dω
where ω is a dummy variable of integration;
the probability density function p_s(s) of the transformed random variable s is found from:
p_s(s) = p_r(r) |dr/ds|
which further gives:
p_s(s) = 1/(L-1), 0 ≤ s ≤ L-1
The equalization transformation T(r) depends on p_r(r), but p_s(s) is always uniform, regardless of the form of p_r(r). Since the image pixel distribution is discrete, the discrete form of the cumulative distribution function is:
s_k = T(r_k) = (L-1) Σ_{j=0}^{k} n_j / (MN)
where 0 ≤ k ≤ L-1, MN is the total number of image pixels, and n_k is the number of pixels with gray level r_k. The equalized gray level s_k of each pixel can be calculated directly from the histogram of the original image.
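The discrete mapping can be computed directly from the histogram, as the paragraph above notes. The sketch below assumes the standard textbook form, in which gray level r_k maps to (L-1) times the cumulative fraction of pixels with gray level at most r_k, rounded to the nearest integer; the function name `equalize` and the tiny example image are illustrative only.

```python
import numpy as np

def equalize(img, levels=256):
    """Histogram equalization for an integer-valued grayscale image."""
    img = np.asarray(img)
    hist = np.bincount(img.ravel(), minlength=levels)  # n_k for each gray level
    cdf = np.cumsum(hist) / img.size                   # sum of n_j / MN up to k
    mapping = np.round((levels - 1) * cdf).astype(np.uint8)
    return mapping[img]                                # look up s_k per pixel

# A dark, low-contrast example: all values sit in the lower part of the range.
img = np.array([[50, 50],
                [60, 70]], dtype=np.uint8)
out = equalize(img)
```

Equal input gray levels always map to equal outputs, and the brightest input level maps to L-1, so the result spans the upper end of the available range.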
3) Homomorphic filtering
During shooting, uneven illumination of components gives some images a large gray-level dynamic range: black and white form a strong contrast and details cannot be seen clearly. Ordinary piecewise linear gray-level transformation cannot solve these problems, whereas homomorphic filtering can remove the adverse effect of uneven illumination and enhance image details.
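The text does not specify the filter used, so the following is only a generic sketch of homomorphic filtering: take the logarithm, filter in the frequency domain, and exponentiate. The Gaussian high-emphasis transfer function and the parameters `gamma_low`, `gamma_high`, `cutoff` and `c` are assumptions chosen for illustration, not values from the patent.

```python
import numpy as np

def homomorphic_filter(img, gamma_low=0.5, gamma_high=1.5, cutoff=30.0, c=1.0):
    """Attenuate low frequencies (illumination) and boost high frequencies
    (detail) of a non-negative grayscale image."""
    img = np.asarray(img, dtype=np.float64)
    rows, cols = img.shape
    log_img = np.log1p(img)                               # log(1 + f)
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))      # zero frequency at center
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    dist2 = u[:, None] ** 2 + v[None, :] ** 2             # squared distance to center
    transfer = (gamma_high - gamma_low) * (1.0 - np.exp(-c * dist2 / cutoff ** 2)) + gamma_low
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * transfer)).real
    return np.expm1(filtered)                             # undo the log1p

# A constant image contains only the zero-frequency component, which is scaled
# by gamma_low in the log domain.
flat = homomorphic_filter(np.full((8, 8), 99.0))
```

For the constant example with `gamma_low=0.5`, each output pixel equals sqrt(1 + 99) - 1, which shows how slowly varying illumination is compressed.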
4) Smoothing-based denoising
Given the complexity of the field environment being inspected, image noise introduced during image acquisition can seriously degrade image quality, and an appropriate method is needed to remove it. Investigation shows that most of the noise consists of random signals whose effects on the image are independent, so low-pass filtering is used to smooth and denoise the image. The processing effect is shown in fig. 8.
Assuming that the pixel to be processed is f(x, y) and the processed image is g(x, y), the smoothing process can be described as follows:
where T ≥ 0 and Q is the number of pixels in the neighborhood S.
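One common smoothing rule that uses exactly these symbols is thresholded neighborhood averaging: a pixel is replaced by the mean of its neighborhood only when it differs from that mean by more than T. The sketch below assumes this rule, a square neighborhood that includes the center pixel, and edge replication at the borders; none of these choices is confirmed by the text.

```python
import numpy as np

def threshold_mean_filter(f, T=25.0, size=3):
    """Replace f(x, y) by the neighborhood mean where |f - mean| > T."""
    f = np.asarray(f, dtype=np.float64)
    pad = size // 2
    padded = np.pad(f, pad, mode="edge")
    mean = np.zeros_like(f)
    # Sum the size*size shifted copies of the image, then divide by Q.
    for dy in range(size):
        for dx in range(size):
            mean += padded[dy:dy + f.shape[0], dx:dx + f.shape[1]]
    mean /= size * size                        # Q = size * size pixels in S
    return np.where(np.abs(f - mean) > T, mean, f)

# A flat patch with one impulse-noise pixel in the middle.
noisy = np.full((5, 5), 10.0)
noisy[2, 2] = 190.0
smoothed = threshold_mean_filter(noisy)
```

Compared with plain averaging, the threshold leaves pixels that already agree with their surroundings untouched, which limits blurring of edges and fine detail.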
Step four: data set sorting and labeling
1) Write Python scripts to rename the pictures in batches, using six-digit names (000000-999999).
2) The data set is divided into two broad categories, target detection and fault/foreign-object identification, covering six classes: ground wires, vibration dampers, bird nests, signboards, ground wire clamps and spacers, as shown in fig. 9. The training set, test set and verification set are divided randomly according to a fixed proportion.
3) Label the data set in batches using the LabelImg labeling tool, as shown in FIGS. 10 and 11, generating XML files;
4) Organize the data set and the XML files and package them into a data set folder.
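The renaming and random splitting in items 1) and 2) could look roughly like the sketch below. The accepted image suffixes, the 8:1:1 split ratio, the fixed random seed and the function names are all assumptions for illustration; the patent does not state them.

```python
import random
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")  # assumed set of image types

def rename_sequentially(folder, start=0):
    """Rename images in `folder` to six-digit names such as 000000.jpg."""
    folder = Path(folder)
    images = sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
    staged = []
    # Two passes, so a new name never collides with a file not yet renamed.
    for i, path in enumerate(images):
        tmp = path.with_name(f"__tmp_{i}{path.suffix.lower()}")
        path.rename(tmp)
        staged.append(tmp)
    renamed = []
    for i, tmp in enumerate(staged, start=start):
        target = tmp.with_name(f"{i:06d}{tmp.suffix}")
        tmp.rename(target)
        renamed.append(target.name)
    return renamed

def split_dataset(names, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly split names into training, test and verification lists."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * ratios[0])
    n_test = int(len(names) * ratios[1])
    return names[:n_train], names[n_train:n_train + n_test], names[n_train + n_test:]

# Demonstration on a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as d:
    for name in ("b.jpg", "a.JPG", "c.png", "notes.txt"):
        (Path(d) / name).write_bytes(b"")
    new_names = rename_sequentially(d)

train, test, val = split_dataset([f"{i:06d}.jpg" for i in range(100)])
```

Fixing the seed makes the split reproducible, which helps when comparing different algorithm variants on the same data.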
Step five: algorithm improvement
The improved YOLOv3 structure is shown in FIG. 12
Attention-Fusion (Attention-Fusion) method:
Attention-Fusion alleviates the inconsistency problem by creating a mechanism to enhance the connection between different feature maps, the structure of which is shown in fig. 13.
Unlike the feature fusion method of adding element by element and adding line by line, the key idea of the invention is to use attention mechanism to establish the relation between feature channels. It comprises two main steps: feature attention extraction and feature fusion.
The method aims to achieve the purpose of improving the network expression capacity by modeling the interdependence relationship among channels of convolution characteristics of different characteristic graphs, learn and utilize global information among the different characteristic graphs, selectively emphasize key information and inhibit useless information.
In Attention-Fusion, for any given transformation, the input feature maps X1 and X2 are each passed through a 1 × 1 convolution, yielding T1 and T2, where H is the height of the feature map, W is its width, and C1 and C2 are the channel numbers.
The T1 and T2 features are then passed to a maximum average pooling operation that compresses them over the H × W spatial dimensions. Each feature thereby becomes a vector that, in a certain sense, has a global receptive field, and the output dimension matches the number of input feature channels, as given by the following two equations:
A fully connected layer operation follows. A convolution with kernel size 1 × 1 and stride 1 is again used in place of a conventional fully connected operation, to reduce information redundancy and computational load. The fully connected operation yields S1 and S2.
S1 and S2 are added to obtain P, which re-aggregates the original features in the channel dimension, as shown in the formula:
P = S1 + S2 (4.19)
P is passed through a Sigmoid function, and the output weights are regarded as the importance of each fused feature channel. The X1 and X2 features are then weighted channel by channel through a matrix operation, which recalibrates and fuses the original features in the channel dimension. The process is as follows:
Y = (X1 + X2) * Sigmoid(P).
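The data flow of Attention-Fusion can be sketched in plain Python. Several points are assumptions made only for illustration: X1 and X2 have the same number of channels so that X1 + X2 is defined, the pooling step is a global average over H × W, and the convolutions carry no bias or activation. The weight arguments `w1`, `w2`, `v1`, `v2` and all function names are hypothetical.

```python
import math

def conv1x1(x, weight):
    """1 x 1 convolution with stride 1: mixes channels at every pixel.

    x: [C_in][H][W], weight: [C_out][C_in], result: [C_out][H][W].
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wk[c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
         for i in range(h)]
        for wk in weight
    ]

def global_avg_pool(t):
    """Compress each channel over H x W to one value (vector of length C)."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in t]

def channel_mix(z, weight):
    """The 1 x 1 convolution that stands in for a fully connected layer."""
    return [sum(wk[c] * z[c] for c in range(len(z))) for wk in weight]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def attention_fusion(x1, x2, w1, w2, v1, v2):
    """Y = (X1 + X2) * Sigmoid(P), with P = S1 + S2 computed per channel."""
    t1, t2 = conv1x1(x1, w1), conv1x1(x2, w2)
    s1 = channel_mix(global_avg_pool(t1), v1)
    s2 = channel_mix(global_avg_pool(t2), v2)
    gate = [sigmoid(a + b) for a, b in zip(s1, s2)]  # Sigmoid(P)
    h, w = len(x1[0]), len(x1[0][0])
    return [
        [[(x1[c][i][j] + x2[c][i][j]) * gate[c] for j in range(w)]
         for i in range(h)]
        for c in range(len(x1))
    ]
```

Because the gate has one value per channel, the fusion rescales whole channels of X1 + X2 and does not change their spatial layout.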
Step six: model training improvements
1) Cosine learning rate
When the objective function is optimized with a gradient descent algorithm, the learning rate is reduced along a cosine function. The learning rate varies with the number of iterations as shown in the following two formulas:
where η_min and η_max denote the range of the learning rate, T_cur indicates how many epochs have currently been executed, and T_max denotes the total number of epochs.
For convenience of implementation, the invention makes the following modification:
Thus, in actual training, it suffices to reset the optimizer's TotalIoperation and re-initialize T_cur at the corresponding epoch.
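The two formulas are not reproduced in this text. A widely used cosine annealing rule that matches the symbols defined above is η_t = η_min + ½(η_max − η_min)(1 + cos(π · T_cur / T_max)). The sketch below assumes that form and models the reset described above as restarting T_cur at chosen epochs. The function names and the `restart_epochs` parameter are illustrative, not taken from the source.

```python
import math

def cosine_lr(t_cur, t_max, eta_min, eta_max):
    """Assumed form: eta_min + 0.5*(eta_max - eta_min)*(1 + cos(pi*t_cur/t_max))."""
    return eta_min + 0.5 * (eta_max - eta_min) * (
        1.0 + math.cos(math.pi * t_cur / t_max)
    )

def cosine_lr_with_restarts(epoch, restart_epochs, total_epochs, eta_min, eta_max):
    """Restart the cosine cycle (T_cur = 0) at every epoch in restart_epochs."""
    starts = [0] + sorted(e for e in restart_epochs if 0 < e < total_epochs)
    ends = starts[1:] + [total_epochs]
    for start, end in zip(starts, ends):
        if start <= epoch < end:
            # T_cur counts epochs since the last reset; T_max is the cycle length.
            return cosine_lr(epoch - start, end - start, eta_min, eta_max)
    return eta_min  # past the last epoch: stay at the minimum

if __name__ == "__main__":
    for epoch in (0, 25, 49, 50, 99):
        print(epoch, cosine_lr_with_restarts(epoch, [50], 100, 1e-5, 1e-2))
```

With a restart at epoch 50, the rate decays toward η_min over the first 50 epochs, jumps back to η_max, and decays again over the remaining 50.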
2) Synchronous normalization: the BN parameters are fused into the Conv layer. The principle is as follows:
y_BN = W_merge · x + b_merge
where W_merge is the weight after fusion, W is the weight before fusion, Var[x] is the variance of the input features x, E[x] is the statistical mean of the input features x over the data set, b_merge is the bias after fusion, b is the bias before fusion, γ and β are learned parameters, ε is a small constant for numerical stability, and y_BN is the fused output.
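The expressions for W_merge and b_merge are not reproduced in this text. The standard way to fold batch normalization into the preceding convolution, consistent with the symbols listed above, is W_merge = γ · W / sqrt(Var[x] + ε) and b_merge = γ · (b − E[x]) / sqrt(Var[x] + ε) + β. The sketch below assumes that form, with each kernel flattened to a list per output channel; the function name and argument layout are illustrative.

```python
import math

def fuse_conv_bn(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold BN statistics and parameters into convolution weights and biases.

    weight: [C_out][K] flattened kernels; all other arguments have length C_out.
    mean and var are the BN statistics E[x] and Var[x] for each output channel.
    """
    w_merge, b_merge = [], []
    for w, b, m, v, g, bt in zip(weight, bias, mean, var, gamma, beta):
        scale = g / math.sqrt(v + eps)        # gamma / sqrt(Var[x] + eps)
        w_merge.append([scale * wi for wi in w])
        b_merge.append(scale * (b - m) + bt)  # gamma*(b - E[x])/sqrt(...) + beta
    return w_merge, b_merge
```

After fusion, a single multiply-add per layer gives the same output as convolution followed by BN, so the BN layer can be dropped at inference time.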
The designed algorithm was tested experimentally as follows.
To verify that the improved algorithm is effective and achieves its intended purpose, it is first evaluated on a general target detection data set, the VOC2014 data set, under a consistent experimental environment.
FIG. 14 compares the recognition results of different algorithms. As the figure shows, the improved algorithm proposed by the invention achieves higher recognition accuracy than other classical target detection algorithms, with a detected mAP of 81.6%.
Part of the picture detection results are shown in FIG. 15. The algorithm recognizes bird nests, spacers, loose strands of ground wires and faded pole number plates well, but vibration dampers are prone to missed detection: the background is dark and the target color is close to that of the background, so the features are not distinct.
In addition, because there is currently no exact standard for distinguishing a vibration damper in its normal state from one that has slipped, labeling this defect is questionable, and the slippage defect of vibration dampers is therefore not a focus here.
The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
Claims (3)
1. A multi-target detection method for defects of a power transmission line based on improved YOLOv3 is characterized by comprising the following steps:
step one: screening the data set images: the obtained original images are purposefully screened, each image containing at least one of six categories (ground wires, vibration dampers, bird nests, signboards, ground wire clamps and spacers), and target images meeting the requirements are preliminarily selected;
step two: performing image augmentation on the images obtained in step one: the screened images are processed by data augmentation, which comprises translation, rotation, flipping, scaling, cropping, and combined rotation-translation transformation, with the translation distance and rotation angle selected randomly during processing, to obtain a target data set;
step three: after the data augmentation is finished, performing image preprocessing on some of the photos in the target data set, the images being processed with a piecewise linear gray-level transformation method, a histogram equalization method, a homomorphic filtering method and a smoothing denoising method;
step four: sorting and labeling the target data set preprocessed in step three, renaming the pictures in batches, and labeling the target data set in batches;
step five: improving YOLOv3 by combining a feature attention mechanism with fusion to obtain an improved algorithm;
step six: training the improved algorithm with the previously labeled target data set, and completing detection on the pictures to be detected.
2. The multi-target detection method for the defects of the power transmission lines based on the improved YOLOv3 as claimed in claim 1, wherein the algorithm improvement in the step five specifically comprises:
Attention-Fusion: for any given transformation, the input feature maps X1 and X2 are each subjected to a 1 × 1 convolution, yielding T1 and T2, wherein H represents the height of the feature map, W represents the width of the feature map, and C1 and C2 represent the channel numbers;
the T1 and T2 features are passed to a maximum average pooling operation that compresses them over the H × W spatial dimensions, each feature thereby becoming a vector that, in a certain sense, has a global receptive field, and the output dimension matching the number of input feature channels, as in the following two formulas:
then a fully connected layer operation is performed, in which a convolution with kernel size 1 × 1 and stride 1 replaces the fully connected operation in the conventional sense, and S1 and S2 are obtained after the fully connected operation;
S1 and S2 are added to obtain P, which re-aggregates the original features in the channel dimension, as shown in the formula:
P = S1 + S2 (4.19)
P is passed through a Sigmoid function, the output weights are regarded as the importance of each fused feature channel, and the X1 and X2 features are weighted channel by channel through a matrix operation, which recalibrates and fuses the original features in the channel dimension and yields a new feature Y; the process is as follows:
Y = (X1 + X2) * Sigmoid(P).
3. The multi-target detection method for defects of a power transmission line based on improved YOLOv3 as claimed in claim 1, wherein in step six a cosine learning rate and a synchronous normalization technique are used in the training process;
the cosine learning rate processing specifically comprises:
when a gradient descent algorithm is used to optimize the objective function, the learning rate is reduced along a cosine function, and the learning rate varies with the number of iterations as shown in the following two formulas:
wherein η_min and η_max denote the range of the learning rate, T_cur indicates how many epochs have currently been executed, and T_max denotes the total number of epochs; the following modification is made in the training process:
in actual training, it suffices to reset the optimizer's TotalIoperation and re-initialize T_cur at the corresponding epoch.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110600438.3A CN113378672A (en) | 2021-05-31 | 2021-05-31 | Multi-target detection method for defects of power transmission line based on improved YOLOv3 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113378672A true CN113378672A (en) | 2021-09-10 |
Family
ID=77575009
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110600438.3A Pending CN113378672A (en) | 2021-05-31 | 2021-05-31 | Multi-target detection method for defects of power transmission line based on improved YOLOv3 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113378672A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113870265A (en) * | 2021-12-03 | 2021-12-31 | 绵阳职业技术学院 | Industrial part surface defect detection method |
CN114913473A (en) * | 2022-03-21 | 2022-08-16 | 中国科学院光电技术研究所 | Lightweight single-body imaging contact network safety patrol instrument |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160142747A1 (en) * | 2014-11-17 | 2016-05-19 | TCL Research America Inc. | Method and system for inserting contents into video presentations |
CN107330871A (en) * | 2017-06-29 | 2017-11-07 | 西安工程大学 | The image enchancing method of insulator automatic identification is run under bad weather condition |
CN108257090A (en) * | 2018-01-12 | 2018-07-06 | 北京航空航天大学 | A kind of high-dynamics image joining method that camera is swept towards airborne row |
CN110599445A (en) * | 2019-07-24 | 2019-12-20 | 安徽南瑞继远电网技术有限公司 | Target robust detection and defect identification method and device for power grid nut and pin |
CN111681240A (en) * | 2020-07-07 | 2020-09-18 | 福州大学 | Bridge surface crack detection method based on YOLO v3 and attention mechanism |
CN112464910A (en) * | 2020-12-18 | 2021-03-09 | 杭州电子科技大学 | Traffic sign identification method based on YOLO v4-tiny |
CN112508014A (en) * | 2020-12-04 | 2021-03-16 | 东南大学 | Improved YOLOv3 target detection method based on attention mechanism |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113870265A (en) * | 2021-12-03 | 2021-12-31 | 绵阳职业技术学院 | Industrial part surface defect detection method |
CN114913473A (en) * | 2022-03-21 | 2022-08-16 | 中国科学院光电技术研究所 | Lightweight single-body imaging contact network safety patrol instrument |
CN114913473B (en) * | 2022-03-21 | 2023-08-15 | 中国科学院光电技术研究所 | Lightweight monomer type imaging contact net safety inspection instrument |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210910 |