CN111402248A

CN111402248A - Transmission line lead defect detection method based on machine vision

Info

Publication number: CN111402248A
Application number: CN202010207255.0A
Authority: CN
Inventors: 杜启亮; 黎春翔; 田联房; 邝东海
Original assignee: South China University of Technology SCUT; Zhuhai Institute of Modern Industrial Innovation of South China University of Technology
Current assignee: South China University of Technology SCUT; Zhuhai Institute of Modern Industrial Innovation of South China University of Technology
Priority date: 2020-03-23
Filing date: 2020-03-23
Publication date: 2020-07-10
Anticipated expiration: 2040-03-23
Also published as: CN111402248B

Abstract

The invention discloses a machine-vision-based method for detecting defects in power transmission line conductors, comprising the following steps: acquiring field image data and building a training data set; constructing and training an instance segmentation network to obtain a prediction model; running inference on an input picture with the model to obtain a rectangular region image and a binary mask image of the conductor; extracting the conductor skeleton with a skeletonization algorithm, calculating the average width of the conductor, and reconstructing the binary mask image; removing the influence of uneven illumination from the rectangular region image with a homomorphic filtering algorithm, and extracting the segmented conductor region image using the reconstructed binary mask image; generating a large number of rectangular boxes on the conductor region and screening them; building a classification training data set, and constructing and training a shallow classification network to obtain a classification prediction model; and inputting the conductor segment region pictures into the classification prediction model and computing the defect types and defect proportion of the conductor. The invention can accurately segment the conductor, detect its state segment by segment, and determine the defect type and defect degree of the conductor.

Description

Transmission line conductor defect detection method based on machine vision

Technical Field

The invention relates to the technical field of transmission line conductor defect detection, and in particular to a machine-vision-based transmission line conductor defect detection method.

Background

The safety of transmission line conductors determines whether electric power can be transmitted normally; the conductors are the blood vessels of the power grid, transmitting and distributing electric power. In outdoor environments, however, conductors are easily affected by corrosion and external mechanical damage, and develop defects such as corrosion, abrasion and strand breakage, which pose a serious hidden danger to the normal operation of the transmission line. Existing conductor detection methods are mainly based on the Hough transform and morphological image processing. On the one hand, these methods generalize poorly to the varying shooting angles and scenes found outdoors; on the other hand, they cannot segment and detect conductors that are not straight lines.

The present method aims to provide a machine-vision-based conductor defect detection method. It uses a deep learning network to segment outdoor transmission line scene pictures taken by an unmanned aerial vehicle, coarsely locates the conductor region and its binary mask, cuts the conductor region into segments, and identifies the state of each segment with a shallow convolutional neural network. The conductor can thus be segmented effectively and inspected accurately segment by segment in complex and variable scenes.

In view of the above, a machine-vision-based conductor defect detection method has high practical application value.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art by providing a machine-vision-based method for detecting defects in power transmission line conductors, which can accurately segment the conductor, detect its state segment by segment, and determine the defect type and defect degree of the conductor.

To achieve this purpose, the technical solution provided by the invention is a machine-vision-based transmission line conductor defect detection method comprising the following steps:

1) acquiring field image data and building a training data set;

2) constructing and training an instance segmentation network to obtain a prediction model;

3) running inference on an input picture with the model to obtain a rectangular region image and a binary mask image of the conductor;

4) extracting the conductor skeleton with a skeletonization algorithm to obtain a binary skeleton image of the conductor, calculating the average width of the conductor, and reconstructing the binary mask image of the conductor region;

5) removing the influence of uneven illumination from the rectangular region image with a homomorphic filtering algorithm, and extracting the segmented conductor region image using the reconstructed binary mask image;

6) generating a large number of rectangular boxes on the conductor region, screening them, and removing the boxes with a high degree of overlap;

7) cropping and scaling the rectangular boxes, building a classification training data set, and constructing and training a shallow classification network to obtain a classification prediction model;

8) inputting the conductor segment region pictures obtained from the test data into the classification prediction model to obtain the state of each conductor segment, and computing the defect types and defect proportion from those states.

In step 1), an unmanned aerial vehicle inspects the power transmission line along its route, and images of the transmission line are collected and transmitted over a network to a remote server. The field data obtained are divided into training data and test data at a ratio of 7:3, the conductor point sets in the training data are annotated with the Labelme software, and a training data set for the conductor instance segmentation task is constructed.
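The 7:3 division of the field data can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `split_dataset`, the file names and the use of a seeded random shuffle are assumptions, since the source only states the ratio.

```python
import random

def split_dataset(paths, train_fraction=0.7, seed=0):
    """Shuffle image paths and split them into training and test lists.

    The random shuffle and the fixed seed are assumptions; the source
    only specifies the 7:3 ratio.
    """
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical file names standing in for the UAV field images.
images = [f"img_{i:03d}.jpg" for i in range(10)]
train, test = split_dataset(images)
```

Fixing the seed keeps the split reproducible, so the same images stay in the test set across training runs.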

In step 2), the conductor is segmented with a multi-task Mask R-CNN network; the Mask R-CNN network is built with PyTorch and trained to obtain a prediction model. The Mask R-CNN network mainly comprises a backbone network, a region proposal network (RPN), a RoIAlign module, a classification branch, a coordinate regression branch and a Mask branch. Inference with the Mask R-CNN network comprises the following steps:

2.1) first, features are extracted from the input image by the backbone network to obtain feature maps at different scales;

2.2) the RPN generates region proposals: each point on the feature map generates rectangular boxes of different sizes, which the network coarsely classifies and coarsely localizes; a large number of rectangular boxes are filtered out based on confidence and non-maximum suppression, and the remaining boxes are passed to the subsequent network;

2.3) the RoIAlign module converts the feature map regions covered by rectangular boxes of different sizes and scales into feature maps of a fixed size: it divides each rectangular box into a fixed number of cells without quantizing the cell boundaries, determines four fixed coordinate positions in each cell, computes the values at the four positions by bilinear interpolation, and performs max pooling over those four values;

2.4) the fixed-size feature map is used as the input of the classification branch, the coordinate regression branch and the Mask branch; the classification branch outputs the category of the feature map in one-hot encoded form, the coordinate regression branch predicts the offsets in coordinates, width and height between the rectangular box and the real target region, and the Mask branch outputs a binary mask image of the target expressed with the values 0 and 1.
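The RoIAlign operation of step 2.3) can be illustrated for a single cell with the sketch below. It is written in plain Python for readability and is not the network's actual implementation. Placing the four sampling points at the quarter positions of the cell is an assumption; the source says only that four fixed positions are used. The function names are illustrative.

```python
import math

def bilinear_sample(feature, x, y):
    """Sample a 2-D feature map (list of rows) at a real-valued point (x, y)."""
    h, w = len(feature), len(feature[0])
    # Clamp the point to the valid range of the feature map.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - ax) + feature[y0][x1] * ax
    bottom = feature[y1][x0] * (1 - ax) + feature[y1][x1] * ax
    return top * (1 - ay) + bottom * ay

def roi_align_cell(feature, x_min, y_min, x_max, y_max):
    """Max-pool four bilinear samples taken inside one unquantized cell."""
    samples = []
    for fy in (0.25, 0.75):          # assumed sampling positions
        for fx in (0.25, 0.75):
            sx = x_min + fx * (x_max - x_min)
            sy = y_min + fy * (y_max - y_min)
            samples.append(bilinear_sample(feature, sx, sy))
    return max(samples)
```

Because the cell boundaries stay real-valued, no rounding error is introduced between the rectangular box and the feature map, which is the point of avoiding quantization.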

In step 3), the weights of the prediction model are loaded into the network and a test picture is input for a forward pass. At the network output, the category confidence of the conductor is obtained from the classification branch, the rectangular box coordinates of the conductor from the coordinate regression branch, and the binary mask image of the conductor from the Mask branch, so that the rectangular region image and the binary mask image of the conductor are obtained by segmentation.

In step 4), the binary mask image of the conductor is skeletonized with the Zhang-Suen skeletonization algorithm to obtain a binary skeleton image of the conductor. The number of 1s in the binary mask and the number of 1s in the skeleton image are counted and used to calculate the average width of the conductor, which is then adjusted appropriately. A circle whose radius is half the average conductor width is drawn at each point of the binary skeleton image, yielding a binary conductor mask image of uniform width.

In step 5), a homomorphic filtering algorithm is used to enhance the rectangular region image of the conductor. The pixel gray value is regarded as the combination of illumination and reflectance, where illumination is the low-frequency component of the image and reflectance is the high-frequency component. The contributions of illumination and reflectance to the pixel gray value are processed separately in the frequency domain, which enhances the rendering of dark regions and reduces the influence of uneven illumination on the image.

In step 6), a square box whose side length equals the conductor width is generated at each point of the binary skeleton image, and the boxes are then screened using the idea of non-maximum suppression. The screening steps are as follows:

6.1) sort all rectangular boxes by the abscissa of their center coordinates;

6.2) take the first rectangular box, calculate its intersection over union (IoU) with each of the remaining boxes, eliminate the boxes whose IoU is greater than 0.7 and keep only those whose IoU is less than 0.7, then mark the box as traversed;

6.3) take the next rectangular box in order; if it has already been eliminated, move on to the one after it;

6.4) repeat steps 6.2) and 6.3) until every rectangular box has been either traversed or eliminated, which yields the screened rectangular boxes with a low degree of overlap.

In step 7), all rectangular boxes from step 6) are cropped and scaled to a fixed size of 32 × 32 and divided into a training data set and a test data set at a ratio of 8:2. The training data set is labeled with five classes: normal, background, corrosion, damage and strand breakage. A shallow convolutional neural network is built with PyTorch as the classification network, cross entropy is used as the loss function to supervise training, and the training data set is input into the classification network to obtain a classification prediction model.

In step 8), the conductor segment region pictures obtained from the test data are input into the classification prediction model for a forward pass, and a state label is obtained for each conductor segment region at the network output. The defect types of the whole conductor are determined from the state labels, that is, the defect types are the set of categories over all conductor region labels; the defect proportion is the number of rectangular boxes with defects divided by the total number of rectangular boxes.

Compared with the prior art, the invention has the following advantages and beneficial effects:

1. A deep learning instance segmentation algorithm is used to segment the conductor, which improves the robustness of the algorithm and ensures good performance in complex and variable scenes.

2. The method removes the irregular-mask artifact that arises when an instance segmentation algorithm segments the conductor, ensuring that the conductor region can be extracted accurately.

3. Homomorphic filtering is applied to local images for enhancement, which reduces the influence of illumination, shadow and imaging on the state classification of the conductor segments; enhancing only local images shortens the processing time of the algorithm and improves its practicality.

4. A shallow convolutional neural network classifies the states of the conductor segments, which enables high-precision state identification and defect localization and improves the accuracy and robustness of the algorithm.

Drawings

FIG. 1 is a logic flow diagram of the present invention.

Fig. 2 is a power transmission line image acquired by the unmanned aerial vehicle.

FIG. 3 is a network structure diagram of Mask R-CNN.

Fig. 4 is a diagram of a ResNet-50 network architecture.

Fig. 5 is a structure diagram of network module A.

FIG. 6 is an ID block structure diagram.

FIG. 7 is a conv block structure diagram.

FIG. 8 is a network structure diagram of the Mask prediction branch.

Fig. 9 is a rectangular area image of the divided wires.

Fig. 10 is a binarized Mask image of the wire output from the Mask branch.

Fig. 11 is a skeleton diagram of the conductor.

Fig. 12 is a binarized mask image reconstructed from the wire region.

Fig. 13 is a diagram of the effect of the rectangular area image after homomorphic filtering data enhancement.

Fig. 14 is a diagram showing the segmented conductor.

Fig. 15 is a diagram of the effect of the screened rectangular frame on the wire area.

Fig. 16 is a network structure diagram of a shallow convolutional neural network.

Fig. 17 is a diagram illustrating the final wire detection effect.

Detailed Description

The present invention will be further described with reference to the following specific examples.

As shown in fig. 1, the machine-vision-based method for detecting defects in power transmission line conductors provided by this embodiment proceeds as follows:

Step 1: An unmanned aerial vehicle inspects the power transmission line along a fixed route, captures field images of the high-voltage towers from nearby, as shown in fig. 2, and transmits the images to a remote server over a 4G network.

Step 2: The field images acquired by the unmanned aerial vehicle are divided into a training data set and a test data set at a ratio of 7:3. The conductor edges in the training images are annotated as point sets with the Labelme software, producing for each image a label file in JSON format that contains the rectangular box coordinates of the conductor target in the image, the mask point set data and the category information. The images and labels together form the training data set.

Step 3: A Mask R-CNN network is built with the PyTorch library. Its structure is shown in fig. 3 and mainly comprises the backbone network ResNet-50, a region proposal network (RPN), a region feature aggregation module (RoIAlign), a classification branch, a coordinate regression branch and a Mask branch. In the figure, conv is a conventional convolutional layer, Softmax is the classification output layer, and FC is a fully connected layer. The main structure of the whole network is as follows:

The overall structure of the backbone network ResNet-50 is shown in FIG. 4. It mainly consists of ID blocks and CONV blocks, which are in turn composed mainly of the nonlinear activation function ReLU and network module A. Module A is shown in FIG. 5, the ID block in FIG. 6 and the CONV block in FIG. 7. In these figures, CONV2D is a conventional convolutional layer, BatchNorm is a batch normalization layer, ReLU is a nonlinear activation function, MAXPOOL is a max pooling layer, AVGPOOL is an average pooling layer, and FC is a fully connected layer.

The RPN consists of one 3 × 3 convolutional layer, two 1 × 1 convolutional layers and the nonlinear function Softmax. It coarsely classifies the prior rectangular boxes generated on the final feature map of the backbone network and regresses their coordinates, then screens them by classification confidence and degree of overlap to obtain a certain number of candidate rectangular boxes for subsequent processing.

In the RoIAlign module, the feature map inside each rectangular box is first divided evenly into 14 × 14 cells without quantizing the cell boundaries; four fixed coordinate positions are then determined in each cell, the values at the four positions are computed by bilinear interpolation, and max pooling is performed over those four values.

The prediction outputs comprise a classification branch, a coordinate regression branch and a Mask branch. The classification branch consists of one 3 × 3 convolutional layer, one 1 × 1 convolutional layer and a Softmax output layer, and outputs the target class and confidence for each rectangular box. The coordinate regression branch likewise consists of one 3 × 3 convolutional layer, one 1 × 1 convolutional layer and a Softmax output layer, and outputs the offsets in coordinates, width and height between the rectangular box and the ground-truth box. The Mask branch predicts the binary mask of the target; it is a fully convolutional network that also adopts the ResNet-50 structure, with 256 channels in the intermediate layers and as many channels in the last layer as there are classes, here 2. Its structure is shown in FIG. 8.

In the forward pass, features are first extracted from the input image by ResNet-50, and the RPN (region proposal network) generates a large number of candidate rectangular boxes. RoIAlign then converts the feature map region of each rectangular box into a feature map of fixed size, which serves as the input of the classification branch, the coordinate regression branch and the Mask branch. The Mask branch outputs the binary mask of the target, the classification branch outputs the classification result, and the coordinate regression branch outputs the localization offsets used for coordinate correction.
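The coordinate correction performed with the regression offsets can be sketched as below. The source does not spell out how the offsets are parametrized, so this sketch assumes the parametrization commonly used in the R-CNN family (center shifts scaled by box size, log-scale width and height changes); `decode_box` is an illustrative name.

```python
import math

def decode_box(proposal, deltas):
    """Apply predicted offsets (dx, dy, dw, dh) to a proposal (x1, y1, x2, y2).

    Assumed parametrization: dx and dy shift the center by a fraction of
    the box size, dw and dh rescale width and height exponentially.
    """
    x1, y1, x2, y2 = proposal
    dx, dy, dw, dh = deltas
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)
```

With all offsets equal to zero the proposal is returned unchanged, which is the expected behavior when the proposal already matches the target.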

Step 4: The instance segmentation training data set is input into the Mask R-CNN network, and the sum of the classification loss, the coordinate loss and the mask loss is used as the loss function that supervises training. The hyperparameters are set to a batch size of 4 and an initial learning rate of 0.001, the network is trained with the Adam optimizer, and training is terminated when the network converges, yielding the instance segmentation prediction model. At test time, a field image acquired by the unmanned aerial vehicle is scaled to 800 × 800 and passed to the Mask R-CNN prediction model. The classification branch gives the confidence of the conductor and the coordinate regression branch gives the localization offsets used for coordinate correction; from the outputs of these two branches, the coordinates of the upper-left and lower-right corners of the conductor's rectangular box are obtained. The conductor is cropped from the field image according to the box coordinates, as shown in FIG. 9, and the binary mask image of the conductor is obtained from the Mask branch, as shown in FIG. 10.

Step 5: The skeletonization function of the OpenCV library is used to call the Zhang-Suen skeletonization algorithm, which skeletonizes the binary mask image of the conductor obtained in step 4 and produces the binary skeleton image of the conductor shown in fig. 11. The number of pixels with value 1 in the binary skeleton image is taken as the length of the conductor, and the number of pixels with value 1 in the binary conductor mask image of fig. 10 is taken as the area of the conductor. The area divided by the length gives the average width of the conductor; to allow for error, 4 is added to this value to obtain the final width.
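The width estimate described above reduces to a ratio of two pixel counts. The sketch below takes the mask and skeleton as nested lists of 0 and 1 and assumes the skeleton has already been computed; `average_width` and its `margin` parameter are illustrative names, with the default margin of 4 taken from the text.

```python
def average_width(mask, skeleton, margin=4):
    """Estimate the conductor width as area / length, plus an error margin."""
    area = sum(sum(row) for row in mask)        # pixels with value 1 in the mask
    length = sum(sum(row) for row in skeleton)  # pixels with value 1 in the skeleton
    if length == 0:
        raise ValueError("skeleton contains no foreground pixels")
    return area / length + margin

# A 3-pixel-wide horizontal band whose skeleton is its middle row.
mask = [[1] * 5 for _ in range(3)]
skeleton = [[0] * 5, [1] * 5, [0] * 5]
width = average_width(mask, skeleton)
```

For this band the area is 15 and the length is 5, so the raw width is 3 and the final width after adding the margin is 7.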

Step 6: All points with pixel value 1 in the binary skeleton image are traversed, and at each point a filled circle of value 1 is drawn with a radius of half the final width. The resulting binary mask image is the reconstructed conductor mask, shown in fig. 12; its width is more uniform than that of the mask obtained from the preceding segmentation.
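The mask reconstruction can be sketched in plain Python as follows. It stamps a filled disc at every skeleton pixel; treating a pixel as inside the disc when its squared distance to the center does not exceed the squared radius is an assumption about the rasterization rule, and `reconstruct_mask` is an illustrative name.

```python
def reconstruct_mask(skeleton, width):
    """Rebuild a uniform-width mask by stamping a disc at each skeleton pixel."""
    h, w = len(skeleton), len(skeleton[0])
    radius = width / 2.0
    reach = int(radius)                      # furthest pixel offset a disc can cover
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if skeleton[y][x] != 1:
                continue
            for yy in range(max(0, y - reach), min(h, y + reach + 1)):
                for xx in range(max(0, x - reach), min(w, x + reach + 1)):
                    if (yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2:
                        mask[yy][xx] = 1
    return mask
```

In practice a library drawing routine would be much faster; the nested loops are kept here only to make the geometry explicit.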

Step 7: Homomorphic filtering is applied to the cropped rectangular region image of the conductor. Let the image be f(x, y), where x is the abscissa of a pixel, y is its ordinate, and f(x, y) is the tristimulus value of the pixel. f(x, y) is expressed as the product of the illumination component i(x, y) and the reflection component r(x, y), as shown in the formula:

f(x,y) = i(x,y)·r(x,y)

where 0 < i(x, y) < ∞ and 0 < r(x, y) < 1.

Taking the logarithm of both sides of the formula and then applying the Fourier transform gives a linear combination in the frequency domain, as shown in the following formulas:

ln f(x,y) = ln i(x,y) + ln r(x,y)

FFT(ln f(x,y)) = FFT(ln i(x,y)) + FFT(ln r(x,y))

In these formulas, ln denotes the logarithm and FFT denotes the Fourier transform.

A Gaussian high-pass filter is then used to adjust the illumination and reflection components: the high-frequency r(x, y) component is boosted, which enhances contrast, and the low-frequency i(x, y) component is attenuated, which compresses the dynamic range. After filtering, the inverse Fourier transform and the exponential (inverse logarithm) are applied to obtain the transformed image; the result of homomorphic filtering is shown in fig. 13.
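A single-channel version of this pipeline can be sketched with NumPy as below. The source names a Gaussian high-pass filter but gives no parameters, so the transfer function H = (γH − γL)(1 − exp(−c·D²/D0²)) + γL and the values of `gamma_low`, `gamma_high`, `cutoff` and `c` are assumptions. The final rescaling to 0..255 and the use of `log1p` to avoid ln(0) are also assumptions. For a color image, the function could be applied per channel or to a luminance channel.

```python
import numpy as np

def homomorphic_filter(gray, gamma_low=0.5, gamma_high=2.0, cutoff=30.0, c=1.0):
    """Homomorphic filtering of a single-channel image with values in [0, 255]."""
    img = np.asarray(gray, dtype=np.float64)
    rows, cols = img.shape
    log_img = np.log1p(img)                        # ln(1 + f) avoids ln(0)
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))
    # Squared distance of each frequency sample from the centered DC term.
    v = np.arange(rows) - rows // 2
    u = np.arange(cols) - cols // 2
    d2 = v[:, None] ** 2 + u[None, :] ** 2
    # Gaussian high-emphasis filter: gamma_low at DC, approaching gamma_high far from it.
    h = (gamma_high - gamma_low) * (1.0 - np.exp(-c * d2 / cutoff ** 2)) + gamma_low
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * h)).real
    out = np.expm1(filtered)                       # inverse of log1p
    # Rescale to [0, 255] for display (assumed post-processing).
    out = (out - out.min()) / (out.max() - out.min() + 1e-12) * 255.0
    return out.astype(np.uint8)
```

Choosing `gamma_low` below 1 attenuates the slowly varying illumination term, while `gamma_high` above 1 boosts the reflectance detail.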

Step 8: The homomorphically filtered rectangular region image is combined by broadcasting (element-wise multiplication) with the reconstructed binary mask image of the conductor region to obtain the segmented conductor region image, as shown in fig. 14. A square box whose side length is the final conductor width is then generated at each point with pixel value 1 in the binary skeleton image of fig. 11, giving the coordinates of its upper-left and lower-right corners. Boxes that overlap too much are eliminated using an idea similar to non-maximum suppression, as follows:

8.1) sort all square boxes by the abscissa of their center coordinates to obtain an ordered series of box coordinates;

8.2) take the first rectangular box, calculate its intersection over union (IoU) with each of the remaining boxes, eliminate the boxes whose IoU is greater than 0.7 and keep only those whose IoU is less than 0.7, then mark the box as traversed;

8.3) take the next rectangular box in order; if it has already been eliminated, move on to the one after it;

8.4) repeat steps 8.2) and 8.3) until every rectangular box has been either traversed or eliminated, which yields the screened rectangular boxes with a low degree of overlap.
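The screening steps above can be sketched as follows. Boxes are given as (x1, y1, x2, y2) tuples; the function names `iou` and `screen_boxes` are illustrative, and the 0.7 threshold comes from the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def screen_boxes(boxes, threshold=0.7):
    """Keep boxes in order of center abscissa, dropping heavy overlaps."""
    ordered = sorted(boxes, key=lambda b: (b[0] + b[2]) / 2.0)
    removed = [False] * len(ordered)
    kept = []
    for i, box in enumerate(ordered):
        if removed[i]:
            continue                      # already eliminated, move to the next box
        kept.append(box)                  # this box is now traversed
        for j in range(i + 1, len(ordered)):
            if not removed[j] and iou(box, ordered[j]) > threshold:
                removed[j] = True
    return kept

# Squares of side 10 shifted one pixel at a time along a horizontal conductor.
candidates = [(x, 0, x + 10, 10) for x in range(6)]
screened = screen_boxes(candidates)
```

Unlike standard non-maximum suppression, the boxes carry no confidence score, so the ordering by center abscissa takes the place of the ordering by score. In this example a one-pixel shift gives an IoU of about 0.82 and a two-pixel shift about 0.67, so every second box survives.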

The screened rectangular boxes are shown in fig. 15.

Step 9: The coordinates of the rectangular boxes screened in step 8 are obtained, the boxes are located in the segmented conductor image of fig. 14, and each is cropped and scaled to a fixed size of 32 × 32 to give a region image of each conductor segment. The region images of all conductor segments are divided into a classification training data set and test data at a ratio of 8:2, and the classification training data set is labeled with five classes: normal, background, corrosion, damage and strand breakage. A shallow convolutional neural network is built as the classification network, with ResNet-18 as the backbone for feature extraction. Finally, the classification training data set is input, cross entropy is used as the loss function to supervise training, the hyperparameters are set to a batch size of 128 and a weight initialization of 0.001, and the network is trained with Adam to obtain the classification prediction model.
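The cross-entropy loss that supervises the five-class classifier can be written out for a single sample as below. This is a plain-Python illustration of the loss itself, not of the network; the order of the entries in `CLASSES` and the function name are assumptions.

```python
import math

# The five states named in the text; the index order is an assumption.
CLASSES = ("normal", "background", "corrosion", "damage", "strand breakage")

def cross_entropy(logits, target_index):
    """Softmax cross-entropy of one sample, computed in a numerically stable way."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target_index]

uniform_loss = cross_entropy([0.0] * len(CLASSES), CLASSES.index("corrosion"))
confident_loss = cross_entropy([10.0, 0.0, 0.0, 0.0, 0.0], CLASSES.index("normal"))
```

An untrained network that outputs equal scores for all five classes has a loss of ln 5, about 1.61, and the loss approaches zero as the score of the correct class dominates.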

Step 10: when the lead is tested for input, the step 8 is adopted to obtain the segmented image of the lead, then each segment is input into the classification prediction model for state recognition, the label information of the image of each segment of the lead is obtained at the network output end, and the lead segment area with the abnormal label is marked on the segmentation graph, as shown in fig. 17. And counting the number of the pictures with the labels of normal and abnormal, and dividing the number of the pictures with the abnormal lead segments by the total number of the rectangular frames to be used as a defect proportion.

In conclusion, the invention provides a new method for detecting defects in power transmission line conductors. It segments the conductor accurately by combining deep learning with traditional image processing algorithms, detects the state of the conductor segment by segment through segmentation and classification, and can accurately detect and locate conductor defects. It has practical value and is worth popularizing.

The embodiments described above are merely preferred embodiments of the present invention and do not limit its scope of protection; any change made according to the shape and principle of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A machine vision-based transmission line conductor defect detection method, characterized by comprising the following steps:

1) acquiring field image data and making a training data set;

2. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 1), an unmanned aerial vehicle inspects the power transmission line along its route, and images of the transmission line are collected and transmitted over a network to a remote server, wherein the field data obtained are divided into training data and test data at a ratio of 7:3, the conductor point sets in the training data are annotated with the Labelme software, and a training data set for the conductor instance segmentation task is constructed.

3. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 2), the conductor is segmented with a multi-task Mask R-CNN network, the Mask R-CNN network is built with PyTorch, and the network is trained to obtain a prediction model; the Mask R-CNN network mainly comprises a backbone network, a region proposal network (RPN), a RoIAlign module, a classification branch, a coordinate regression branch and a Mask branch; inference with the Mask R-CNN network comprises the following steps:

4. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 4), the binary mask image of the conductor is skeletonized with the Zhang-Suen skeletonization algorithm to obtain a binary skeleton image of the conductor, the number of 1s in the binary mask and the number of 1s in the skeleton image are counted, the average width of the conductor is calculated and adjusted appropriately, and a circle whose radius is half the average conductor width is drawn at each point of the binary skeleton image to obtain a binary conductor mask image of uniform width.

5. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 5), a homomorphic filtering algorithm is used to enhance the rectangular region image of the conductor, that is, the pixel gray value is regarded as the combination of illumination and reflectance, wherein illumination is the low-frequency component of the image and reflectance is the high-frequency component, and the contributions of illumination and reflectance to the pixel gray value are processed separately in the frequency domain, so that the rendering of dark regions is enhanced and the influence of uneven illumination on the image is reduced.

6. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 6), a square box whose side length equals the conductor width is generated at each point of the binary skeleton image, and the boxes are then screened using the idea of non-maximum suppression, wherein the screening steps are as follows:

7. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 7), all rectangular boxes from step 6) are cropped and scaled to a fixed size of 32 × 32 and divided into a training data set and a test data set at a ratio of 8:2, the training data set is labeled with five classes, namely normal, background, corrosion, damage and strand breakage, a shallow convolutional neural network is built with PyTorch as the classification network, cross entropy is used as the loss function to supervise training, and the training data set is input into the classification network to obtain a classification prediction model.

8. The machine vision-based transmission line conductor defect detection method of claim 1, characterized in that: in step 8), the conductor segment region pictures obtained from the test data are input into the classification prediction model for a forward pass, a state label is obtained for each conductor segment region at the network output, the defect types of the whole conductor are determined from the state labels, that is, the defect types are the set of categories over all conductor region labels, and the defect proportion is computed as the number of rectangular boxes with defects divided by the total number of rectangular boxes.