CN110826647A - Method and system for automatically detecting foreign matter appearance of power equipment - Google Patents

Method and system for automatically detecting foreign matter appearance of power equipment

Info

Publication number
CN110826647A
Authority
CN
China
Prior art keywords
loss
picture
detection
convolution
foreign matter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911253596.5A
Other languages
Chinese (zh)
Inventor
张传友
孙志周
李健
徐攀
刘晓芬
邢军
杨国庆
邵光亭
王亚飞
邓燕
王贤华
Current Assignee (the listed assignees may be inaccurate)
State Grid Intelligent Technology Co Ltd
Original Assignee
State Grid Intelligent Technology Co Ltd
Priority date (the priority date is an assumption and is not a legal conclusion)
Filing date
Publication date
Application filed by State Grid Intelligent Technology Co Ltd filed Critical State Grid Intelligent Technology Co Ltd
Priority to CN201911253596.5A
Publication of CN110826647A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks


Abstract

The invention provides a method and a system for automatically detecting the appearance of foreign matter on power equipment. A detection image is acquired in a transformer substation and compressed to a specified pixel size; the rotation angle and chromaticity parameters of the picture are adjusted. The adjusted images are used as a training data set, on which a deep neural network performs a forward propagation operation to obtain per-layer loss values and a total loss value, where the total loss comprises the confidence loss of whether each bounding box contains an object, the class loss of each bounding box, and the position loss of each bounding box. A back propagation operation then updates the weights of all layers according to the loss to obtain a trained deep learning model, which is used to detect the appearance of foreign matter in substation images.

Description

Method and system for automatically detecting foreign matter appearance of power equipment
Technical Field
The disclosure relates to target detection in the field of computer vision and image processing, and in particular to a method and a system for automatically detecting the appearance of foreign matter on power equipment.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
At present, automatic substation inspection technology is developing rapidly and substation inspection robots are ever more widely deployed. In practical applications, the appearance of foreign matter in a substation must often be recognized so that the robot can identify the category of the foreign object and perform operations such as obstacle avoidance.
According to the inventors, the foreign matter that currently needs to be detected in substations includes objects such as bird nests, kites, plastic bags, and ropes. Current target detection methods such as HOG and SIFT generally separate image feature extraction from classification: a feature model first extracts the relevant visual features of the image, and a classifier such as an SVM then performs recognition. In practical applications, however, the lack of a training set of actual scenes containing the detection targets, together with the complexity of real detection scenes, is the main cause of the high false detection rate of target detection.
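The two-stage pipeline criticized above (hand-crafted features, then a separate classifier) can be sketched as follows. This is an illustrative simplification, not the patent's method: a single global orientation histogram stands in for a real HOG descriptor, and a simple perceptron stands in for the SVM; all names are invented for the example.

```python
import numpy as np

def grad_orientation_histogram(img, bins=9):
    """A much-simplified HOG-style descriptor: one global histogram of
    gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)      # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-8)            # normalise

def train_perceptron(X, y, epochs=50, lr=0.1):
    """A linear classifier standing in for the SVM of the classical pipeline;
    labels y are in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:           # misclassified: update
                w += lr * yi * xi
                b += lr * yi
    return w, b
```

A horizontal intensity ramp and its transpose produce orientation histograms concentrated in different bins, which the linear classifier then separates — the feature step and the classification step remain entirely decoupled, which is exactly the property the deep-learning approach below removes.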
Disclosure of Invention
The invention aims to solve these problems by providing a method and a system for automatically detecting the appearance of foreign matter on power equipment. Even without a training set of actual appearance-detection scenes containing the detection target, the method greatly reduces the false detection rate of target detection in complex detection scenarios. It is robust to changes in illumination and scene, runs fast enough for real-time detection, and is stable and efficient, overcoming the shortcomings of the prior art when applied to actual scenes.
According to some embodiments, the following technical scheme is adopted in the disclosure:
an automatic detection method for foreign matter appearance of power equipment comprises the following steps:
acquiring a detection image in a transformer substation, and compressing the image to a specified pixel value;
adjusting the rotation angle and the chromaticity parameters of the picture;
using the adjusted images as a training data set, and performing a forward propagation operation on the training data set with a deep neural network to obtain per-layer loss values and a total loss value, wherein the total loss value comprises the confidence loss of whether a bounding box contains an object, the class loss of each bounding box, and the position loss of the bounding box;
and carrying out a back propagation operation on the weights of each layer according to the loss to obtain a trained deep learning model, and using the trained deep learning model to detect the appearance of substation foreign matter such as bird nests, kites, plastic bags, or ropes in the images.
With this technical scheme, the false detection rate of target detection in complex detection scenarios can be greatly reduced even when an actual appearance-detection training set containing the detection target is lacking; the method is strongly robust to illumination and scene changes and runs at high speed.
As an implementation manner, when acquiring detection images in the substation, the selected actual-scene pictures that do not contain a detection target must not contain any object of the target categories.
As an implementation manner, the picture is rotated 45 degrees clockwise and the pixel values of the three RGB channels are adjusted (according to the actual condition of the data) to enlarge the training data set, improving the detection effect and the generalization ability of the model.
In one embodiment, the deep neural network comprises 24 convolutional layers and 2 fully-connected layers.
In one embodiment, during the forward propagation operation of the deep neural network, for an actual-scene picture used as a negative sample, which contains no detection target and carries no label information, only the confidence loss is calculated.
When forward-propagating a negative-sample training picture containing no detection target through the deep neural network, the picture with the set number of channels is stretched into column vectors according to the convolution kernel size and input into the first convolutional layer for the convolution operation. The output matrix then undergoes convolution, pooling, or fully connected operations in turn according to the network structure; before each convolution operation, the input matrix is stretched into column vectors according to the convolution kernel size.
Inputting the training pictures into the network structure and forward-propagating yields, for each grid cell, the probability of each category, the coordinate information of each bounding box, and the confidence that an object is contained.
As an embodiment, the deep neural network used in the forward propagation operation consists of 24 convolutional layers, 4 max-pooling layers, and 2 fully connected layers. In this embodiment, the convolutional network is described as follows:
the network input layer is an RGB image of size 448 × 448 × 3;
the first part consists of a convolutional layer (kernel 7 × 7, stride 2, 64 channels) and a pooling layer (kernel 2 × 2, stride 2);
the second part consists of a convolutional layer (kernel 3 × 3, stride 1, 192 channels) and a pooling layer (kernel 2 × 2, stride 2);
the third part consists of four convolutional layers (kernel 1 × 1, stride 1, 128 channels; kernel 3 × 3, stride 1, 256 channels; kernel 1 × 1, stride 1, 256 channels; kernel 3 × 3, stride 1, 512 channels) and a pooling layer (kernel 2 × 2, stride 2);
the fourth part consists of four groups of two convolutional layers (kernel 1 × 1, stride 1, 256 channels; kernel 3 × 3, stride 1, 512 channels), followed by a convolutional layer (kernel 1 × 1, stride 1, 512 channels), a convolutional layer (kernel 3 × 3, stride 1, 1024 channels), and a pooling layer (kernel 2 × 2, stride 2);
the fifth part consists of two groups of two convolutional layers (kernel 1 × 1, stride 1, 512 channels; kernel 3 × 3, stride 1, 1024 channels), followed by one convolutional layer (kernel 3 × 3, stride 2, 1024 channels);
the sixth part consists of two convolutional layers (kernel 3 × 3, stride 1, 1024 channels);
the seventh and eighth parts are two fully connected layers.
As an embodiment, a small loss weight is assigned to the confidence loss of bounding boxes that contain no object; the loss weights of the confidence loss and class loss of bounding boxes that contain an object are normally 1.
As an embodiment, the specific process of updating the weights of each layer by back propagation is as follows: the sensitivity map is calculated, then the gradients required to update the biases and the weights are computed, and finally the biases and weights are updated by gradient descent according to these gradients.
An automatic detection system for foreign matter appearance of power equipment comprises:
the image acquisition module is configured to acquire a detection image in the transformer substation and compress the picture to a specified pixel value;
the adjusting module is configured to adjust the rotation angle and the chromaticity parameters of the picture;
a loss calculation module configured to use the adjusted images as a training data set and perform a forward propagation operation on it with a deep neural network, obtaining per-layer loss values and a total loss value, where the total loss comprises the confidence loss of whether a bounding box contains an object, the class loss of each bounding box, and the position loss of the bounding box;
and a detection module configured to perform a back propagation operation on the weights of each layer according to the loss to obtain a trained deep learning model, which is used to detect the appearance of substation foreign matter such as bird nests, kites, plastic bags, and/or ropes in the image.
A computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor of a terminal device to execute the method for automatically detecting the foreign matter appearance of power equipment.
A terminal device comprising a processor and a computer readable storage medium, the processor being configured to implement instructions; the computer readable storage medium is used for storing a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the automatic foreign matter appearance detection method for the power equipment.
Compared with the prior art, the beneficial effect of this disclosure is:
the invention innovatively provides an automatic detection method for foreign matter appearance of power equipment, which solves the problem of high target false detection rate of foreign matter, realizes detection of foreign matter appearance under different scenes, illumination and other conditions, and greatly improves the precision and efficiency of foreign matter detection.
A foreign-matter appearance detection model is constructed and a multi-fold sample expansion and enhancement technique for the training set is designed, which solves the low detection rate of traditional recognition techniques for foreign-matter targets, improves the reliability of foreign-matter detection, and reduces the false alarm rate.
Even without a training set of actual appearance-detection scenes containing the detection target, the method greatly reduces the false detection rate in complex detection scenarios. It is strongly robust to illumination and scene changes, runs fast enough for real-time detection on a processor, and is stable and efficient, overcoming the shortcomings of the prior art when applied to actual scenes.
Actual-scene pictures containing no detection target are added to the training set as negative samples, and the loss of these background pictures is included in the loss calculation, which greatly reduces the model's false detection rate on background in actual detection scenes.
The deep neural network of the present disclosure is deeper, consisting of 24 convolutional layers, 4 pooling layers, and 2 fully connected layers. After feature extraction, a single branch performs recognition and localization simultaneously. For negative samples, actual-scene images without detection targets are added, and dedicated network handling greatly reduces the false detection rate on background.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and are not to limit the disclosure.
FIG. 1 is a flow chart of the present disclosure;
FIG. 2 is a diagram of a network architecture used;
FIG. 3 is a schematic illustration of converting a multi-channel image matrix into vectors as used in the present disclosure;
FIG. 4 is a schematic diagram of a convolution operation used in the present disclosure;
FIGS. 5(a) - (c) are graphs of the effect of detection.
Detailed description of embodiments:
the present disclosure is further described with reference to the following drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the method for automatically detecting the foreign object appearance of the power equipment based on deep learning includes the following steps:
the collected picture data are labeled by using a picture labeling tool, the picture data comprise non-actual scene pictures containing detection targets and actual scene pictures not containing the detection targets, and it is noted that the selected actual scene pictures not containing the detection targets cannot have any target types, otherwise, the trained model can identify the targets as backgrounds during detection, so that the accuracy is reduced.
The pictures are resized to a fixed, smaller width and height in pixels to increase detection speed.
The number of training pictures is increased by adjusting the rotation angle, exposure, saturation, hue, and so on, improving the detection effect and the generalization ability of the model.
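The resizing and augmentation steps can be sketched in NumPy as below. This is an illustrative sketch only: nearest-neighbour resizing and 90-degree rotations via np.rot90 are used for simplicity (an interpolating rotation, e.g. Pillow's Image.rotate, would be needed for the 45-degree steps mentioned earlier), and the exposure and saturation adjustments are crude channel arithmetic.

```python
import numpy as np

def resize_nearest(img, size=448):
    """Nearest-neighbour resize of an HxWx3 image to size x size."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def adjust_exposure(img, factor):
    """Scale overall brightness, clipping to the valid 8-bit range."""
    return np.clip(img * factor, 0, 255)

def adjust_saturation(img, factor):
    """Interpolate each pixel between its grey value and its colour."""
    gray = img.mean(axis=2, keepdims=True)   # crude luminance
    return np.clip(gray + factor * (img - gray), 0, 255)

def augment(img):
    """Yield a few augmented copies of one training picture."""
    for rot in (0, 1, 2, 3):                 # multiples of 90 degrees
        base = np.rot90(img, rot)
        for exp in (0.8, 1.0, 1.2):
            yield adjust_exposure(base, exp)
```

Each source picture here yields twelve variants (4 rotations × 3 exposures); a real pipeline would combine more of the listed adjustments.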
The resized and augmented input images (steps 2 and 3) are fed into a single deep neural network comprising 24 convolutional layers and 2 fully connected layers for a forward propagation operation, yielding per-layer loss values and a total loss value; the total loss comprises the confidence loss of whether each bounding box contains an object, the class loss of each bounding box, and the position loss of each bounding box. Note that for actual-scene pictures without detection targets, which are used as negative samples and carry no label information, only the confidence loss is calculated.
And performing back propagation operation on each layer weight according to the loss.
At this point the model training part is finished, and the model can then be used to perform actual detection on test pictures.
The specific steps and details of forward-propagating a negative-sample training picture containing no detection target through the deep neural network in step 4) are as follows:
4-1) To facilitate the convolution matrix operation, the RGB image with 3 channels is stretched into column vectors according to the convolution kernel size and then input into the first convolutional layer of the deep neural network for the convolution operation, as shown in FIGS. 3 and 4. The output matrix then undergoes convolution, pooling, or fully connected operations in turn according to the network structure; before each convolution operation, the input matrix is stretched into column vectors according to the convolution kernel size.
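The "stretch into column vectors" step of FIGS. 3 and 4 is the standard im2col trick, which turns a whole convolution layer into a single matrix product. A minimal NumPy sketch, assuming a single image and no padding:

```python
import numpy as np

def im2col(x, k, stride=1):
    """Stretch a CxHxW input into columns: each column holds one kxk
    receptive field across all C channels (no padding in this sketch)."""
    c, h, w = x.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    cols = np.empty((c * k * k, out_h * out_w))
    idx = 0
    for i in range(0, h - k + 1, stride):
        for j in range(0, w - k + 1, stride):
            cols[:, idx] = x[:, i:i + k, j:j + k].ravel()
            idx += 1
    return cols

def conv_as_matmul(x, weights, stride=1):
    """weights: (out_channels, in_channels, k, k).  One matrix product
    computes every output position of every filter at once."""
    oc, ic, k, _ = weights.shape
    cols = im2col(x, k, stride)
    out = weights.reshape(oc, -1) @ cols
    out_h = (x.shape[1] - k) // stride + 1
    return out.reshape(oc, out_h, -1)
```

The payoff is that the convolution of every filter at every position becomes one dense matrix multiplication, which is what makes this layout fast on the hardware the network runs on.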
4-2) The deep neural network structure used above consists of 24 convolutional layers, 4 max-pooling layers, and 2 fully connected layers. As shown in FIG. 2, the convolutional network is described in detail as follows:
the network input layer is an RGB image of size 448 × 448 × 3;
the first part consists of a convolutional layer (kernel 7 × 7, stride 2, 64 channels) and a pooling layer (kernel 2 × 2, stride 2);
the second part consists of a convolutional layer (kernel 3 × 3, stride 1, 192 channels) and a pooling layer (kernel 2 × 2, stride 2);
the third part consists of four convolutional layers (kernel 1 × 1, stride 1, 128 channels; kernel 3 × 3, stride 1, 256 channels; kernel 1 × 1, stride 1, 256 channels; kernel 3 × 3, stride 1, 512 channels) and a pooling layer (kernel 2 × 2, stride 2);
the fourth part consists of four groups of two convolutional layers (kernel 1 × 1, stride 1, 256 channels; kernel 3 × 3, stride 1, 512 channels), followed by a convolutional layer (kernel 1 × 1, stride 1, 512 channels), a convolutional layer (kernel 3 × 3, stride 1, 1024 channels), and a pooling layer (kernel 2 × 2, stride 2);
the fifth part consists of two groups of two convolutional layers (kernel 1 × 1, stride 1, 512 channels; kernel 3 × 3, stride 1, 1024 channels), followed by one convolutional layer (kernel 3 × 3, stride 2, 1024 channels);
the sixth part consists of two convolutional layers (kernel 3 × 3, stride 1, 1024 channels);
the seventh and eighth parts are two fully connected layers.
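The spatial bookkeeping of the architecture above can be checked in a few lines of Python. This sketch assumes "same" padding for stride-1 layers, so only the six stride-2 operations shrink the feature map, taking a 448 × 448 input down to a 7 × 7 grid. Note that the parts as listed enumerate 23 convolutional layers, one fewer than the 24 stated; the spatial sizes are unaffected either way.

```python
# (kind, kernel, stride, channels); stride-1 convolutions are assumed to use
# "same" padding, so only stride-2 layers halve the feature map.
LAYERS = (
    [("conv", 7, 2, 64), ("pool", 2, 2, None)]                           # part 1
    + [("conv", 3, 1, 192), ("pool", 2, 2, None)]                        # part 2
    + [("conv", 1, 1, 128), ("conv", 3, 1, 256),
       ("conv", 1, 1, 256), ("conv", 3, 1, 512), ("pool", 2, 2, None)]   # part 3
    + [("conv", 1, 1, 256), ("conv", 3, 1, 512)] * 4                     # part 4
    + [("conv", 1, 1, 512), ("conv", 3, 1, 1024), ("pool", 2, 2, None)]
    + [("conv", 1, 1, 512), ("conv", 3, 1, 1024)] * 2                    # part 5
    + [("conv", 3, 2, 1024)]
    + [("conv", 3, 1, 1024), ("conv", 3, 1, 1024)]                       # part 6
)

def trace(size=448):
    """Return (final spatial size, number of convolutional layers)."""
    n_conv = 0
    for kind, _kernel, stride, _channels in LAYERS:
        if kind == "conv":
            n_conv += 1
        size //= stride          # "same" padding: only the stride shrinks it
    return size, n_conv
```

Running `trace()` confirms that the fully connected head of the network sees a 7 × 7 grid of 1024-channel features.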
After a training picture is input into the YOLO network structure and forward propagation is performed, the probability of each category for each grid cell, the coordinate information x, y, w, h of each bounding box, and the confidence that an object is contained are obtained.
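This output can be pictured as an S × S × (B·5 + C) tensor and decoded into detections as sketched below. The grid size, per-cell box count, and the score rule (confidence times class probability) follow the usual YOLO convention; the class count and threshold value here are illustrative assumptions.

```python
import numpy as np

S, B, C = 7, 2, 4   # grid size, boxes per cell, classes (e.g. nest/kite/bag/rope)

def decode(pred, conf_thresh=0.5):
    """pred: (S, S, B*5 + C) network output.  Returns a list of
    (row, col, class_id, score, (x, y, w, h)) detections."""
    dets = []
    for r in range(S):
        for c in range(S):
            cell = pred[r, c]
            class_probs = cell[B * 5:]
            for b in range(B):
                x, y, w, h, conf = cell[b * 5: b * 5 + 5]
                score = conf * class_probs.max()   # P(obj) * P(class | obj)
                if score > conf_thresh:
                    dets.append((r, c, int(class_probs.argmax()), score,
                                 (x, y, w, h)))
    return dets
```

A production decoder would additionally apply non-maximum suppression across overlapping boxes; this sketch only shows how the raw tensor maps back to per-cell boxes, classes, and confidences.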
4-3) The specific steps for calculating the loss in step 4) are as follows:
The loss function must balance three aspects: the coordinates (x, y, w, h), the confidence, and the classification. Simply using a sum-squared error loss is unreasonable for two reasons: (1) the position loss and the classification loss, which have different dimensions, would be treated as equally important; and (2) if a grid cell contains no object (and most cells in an image do not), the confidence of its bounding boxes is pushed toward 0; since such cells far outnumber the cells that contain objects, they would contribute much more to the gradient update, which can make the network unstable or even cause divergence.
A small loss weight, denoted λnoobj, is therefore given to the confidence loss of bounding boxes without objects; the loss weights of the confidence loss and class loss of bounding boxes with objects are normally 1. Note that for the added actual-scene pictures used as negative samples, which have no labeled bounding box coordinates or class information, the same small weight is applied to the confidence of their object-free bounding boxes.
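A minimal NumPy sketch of this weighting scheme follows. The values λcoord = 5 and λnoobj = 0.5 are those commonly used with YOLO-style losses and are assumptions here; the patent only says the no-object weight is "small".

```python
import numpy as np

LAMBDA_COORD, LAMBDA_NOOBJ = 5.0, 0.5   # assumed values, see lead-in

def yolo_style_loss(pred_conf, pred_box, pred_cls, obj_mask, true_box, true_cls):
    """Sum-squared loss over a flattened set of grid cells.
    obj_mask is 1 where a cell is responsible for a labelled object.
    For pure-negative pictures obj_mask is all zero, so only the
    down-weighted confidence term survives."""
    noobj_mask = 1.0 - obj_mask
    conf_loss = (obj_mask * (pred_conf - 1.0) ** 2
                 + LAMBDA_NOOBJ * noobj_mask * pred_conf ** 2).sum()
    coord_loss = LAMBDA_COORD * (obj_mask[:, None]
                                 * (pred_box - true_box) ** 2).sum()
    cls_loss = (obj_mask[:, None] * (pred_cls - true_cls) ** 2).sum()
    return conf_loss + coord_loss + cls_loss
```

With an all-zero object mask (a negative-sample picture) only the λnoobj-weighted confidence term is non-zero, which is exactly the "only calculate its confidence loss" behaviour described above.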
The specific calculation details for updating the weights of the layers in step 5) by using back propagation are as follows:
the sensitivity map is calculated, then the gradient required by bias updating and the gradient required by weight updating are calculated, and finally the bias and the weight are updated through gradient descent according to the gradient.
As a typical embodiment, the deep-learning-based algorithm for automatically detecting the appearance of foreign matter in a substation proceeds as follows:
the first step is as follows: and marking the picture data.
The collected network images containing detection targets and the collected actual-scene images without detection targets are labeled with the LabelImage tool. Note that for images containing a detection target, the category and bounding box coordinates are annotated in the normal way; for pictures without a detection target, an empty label file must be generated.
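The labeling convention, including the empty label file for negative samples, can be sketched as follows. The one-line-per-box text format with normalised coordinates is an assumption modelled on the usual YOLO label layout, not taken from the patent.

```python
import os

def write_label(label_dir, image_name, boxes):
    """Write one label file per picture.  `boxes` is a list of
    (class_id, x, y, w, h) with coordinates normalised to [0, 1];
    an empty list produces the empty label file required for
    negative-sample pictures that contain no detection target."""
    path = os.path.join(label_dir, image_name.rsplit(".", 1)[0] + ".txt")
    with open(path, "w") as f:
        for cls, x, y, w, h in boxes:
            f.write(f"{cls} {x:.6f} {y:.6f} {w:.6f} {h:.6f}\n")
    return path
```

The empty file matters: it tells the training loader that the picture was deliberately labelled as background, rather than simply missing its annotation.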
The second step is that: and (5) preprocessing a training set.
After the pictures are resized to 448 × 448 pixels, the number of training pictures is increased by adjusting the rotation angle, exposure, saturation, hue, and so on, improving the detection effect and the generalization ability of the model.
The third step: and (4) forward propagation.
In this embodiment, a convolutional neural network consisting of 24 convolutional layers, 4 max-pooling layers, and 2 fully connected layers is trained to predict the coordinates of the target boxes, the probability that each grid cell contains an object, and, given that an object is contained, the probability that it belongs to each category. The structure of the model is shown in FIG. 2: the network input layer is an RGB image of size 448 × 448 × 3; the first part consists of a convolutional layer (kernel 7 × 7, stride 2, 64 channels) and a pooling layer (kernel 2 × 2, stride 2); the second part consists of a convolutional layer (kernel 3 × 3, stride 1, 192 channels) and a pooling layer (kernel 2 × 2, stride 2); the third part consists of four convolutional layers (kernel 1 × 1, stride 1, 128 channels; kernel 3 × 3, stride 1, 256 channels; kernel 1 × 1, stride 1, 256 channels; kernel 3 × 3, stride 1, 512 channels) and a pooling layer (kernel 2 × 2, stride 2); the fourth part consists of four groups of two convolutional layers (kernel 1 × 1, stride 1, 256 channels; kernel 3 × 3, stride 1, 512 channels), followed by a convolutional layer (kernel 1 × 1, stride 1, 512 channels), a convolutional layer (kernel 3 × 3, stride 1, 1024 channels), and a pooling layer (kernel 2 × 2, stride 2); the fifth part consists of two groups of two convolutional layers (kernel 1 × 1, stride 1, 512 channels; kernel 3 × 3, stride 1, 1024 channels), followed by one convolutional layer (kernel 3 × 3, stride 2, 1024 channels); the sixth part consists of two convolutional layers (kernel 3 × 3, stride 1, 1024 channels); the seventh and eighth parts are two fully connected layers. After a training picture is input into the YOLO network structure for forward propagation, the probability of each category for each grid cell, the coordinate information x, y, w, h of each box, and the object confidence are obtained; the loss of the network is then obtained by combining these with the labeling information of each picture.
the fourth step: and is propagated in the reverse direction.
For the back propagation process, let $a^l$ be the output of layer $l$ and $z^l$ the value before the activation function, i.e. $a^l = \sigma(z^l)$; let $C$ be the cost (error), $b^l$ the bias of layer $l$, $w^l$ the weight of layer $l$, $\delta^l = \partial C / \partial z^l$ the sensitivity map of layer $l$, $\alpha$ the learning rate, and $\odot$ the element-wise product.
First, the sensitivity map $\delta$, also called the error map, is calculated; for the output layer $L$:
$\delta^L = \nabla_{a^L} C \odot \sigma'(z^L)$
If layer $l$ is a fully connected layer:
$\delta^{l-1} = (W^l)^T \delta^l \odot \sigma'(z^{l-1})$
If layer $l$ is a convolutional layer:
$\delta^{l-1} = \delta^l * \mathrm{rot180}(w^l) \odot \sigma'(z^{l-1})$
The gradient required for the bias update:
for a fully connected layer: $\frac{\partial C}{\partial b^l} = \delta^l$;
for a convolutional layer: $\frac{\partial C}{\partial b^l} = \sum_{u=1}^{w} \sum_{v=1}^{h} (\delta^l)_{u,v}$,
where $w, h$ denote the size of the convolution output.
The gradient required for the weight update (here $*$ denotes convolution):
for a fully connected layer: $\frac{\partial C}{\partial w^l} = \delta^l (a^{l-1})^T$;
for a convolutional layer: $\frac{\partial C}{\partial w^l} = a^{l-1} * \delta^l$.
Finally, the bias and the weights are updated by gradient descent according to the gradients above:
$w^l \leftarrow w^l - \alpha \frac{\partial C}{\partial w^l}, \qquad b^l \leftarrow b^l - \alpha \frac{\partial C}{\partial b^l}$
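The update rules above can be sanity-checked numerically. Below is a minimal sketch for one fully connected layer with a sigmoid activation and squared error; the finite-difference comparison in the usage verifies that the sensitivity-map formula produces the true gradient. The layer shapes and cost are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fc_backward(x, w, b, target):
    """One fully connected layer, cost C = 0.5 * ||a - target||^2.
    Returns (dC/dw, dC/db) via the sensitivity map
    delta = dC/dz = (a - target) * sigma'(z)."""
    z = w @ x + b
    a = sigmoid(z)
    delta = (a - target) * a * (1.0 - a)   # sigma'(z) = a * (1 - a)
    return np.outer(delta, x), delta       # dC/dw = delta x^T, dC/db = delta

def sgd_step(w, b, dw, db, lr=0.1):
    """The gradient-descent update from the equations above."""
    return w - lr * dw, b - lr * db
```

Perturbing a single weight by a small epsilon and comparing the central difference of the cost with the analytic gradient is the standard way to confirm a back propagation implementation.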
as shown in fig. 5(a), 5(b), and 5(c), it can be seen that the present embodiment can detect foreign objects such as bird nests, kites, plastic bags, ropes, and the like. By adopting the optimization process considering the negative sample in the embodiment, the accuracy of image identification can be obviously improved. In the embodiment, the actual scene picture not containing the detection target is added in the training set to be used as the negative sample, and the loss of the background picture is considered during the loss calculation, so that the false detection rate of the model to the background in the actual detection scene is greatly reduced.
A computer-readable storage medium storing a plurality of instructions adapted to be loaded by a processor of a terminal device to execute the method for automatically detecting the foreign matter appearance of power equipment.
A terminal device comprising a processor and a computer readable storage medium, the processor being configured to implement instructions; the computer readable storage medium is used for storing a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the automatic foreign matter appearance detection method for the power equipment.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Although the present disclosure has been described with reference to specific embodiments, it should be understood that the scope of the present disclosure is not limited thereto, and those skilled in the art will appreciate that various modifications and changes can be made without departing from the spirit and scope of the present disclosure.

Claims (10)

1. An automatic detection method for the foreign matter appearance of power equipment, characterized by comprising the following steps:
acquiring a detection image in a transformer substation, and compressing the image to a specified pixel value;
adjusting the rotation angle and the chromaticity parameters of the picture;
using part of the adjusted images as a training data set, and performing a forward propagation operation on the training data set with a deep neural network to obtain the loss value of each layer and the total loss value, wherein the total loss value comprises the confidence loss of whether a bounding box contains an object, the category loss of each bounding box, and the position loss of each bounding box;
and performing a back propagation operation on the weights of all layers according to the loss to obtain a trained deep learning model, and detecting the foreign matter appearance of the transformer substation in the image by using the trained deep learning model.
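The first step of claim 1, compressing the acquired picture to a specified pixel value, can be illustrated with a simple nearest-neighbour resize; the helper `compress_image` and its interface are hypothetical, not taken from the patent:

```python
import numpy as np

def compress_image(img, out_h, out_w):
    """Nearest-neighbour resize of an H x W (or H x W x C) image array
    to the specified pixel size, standing in for the compression step."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows][:, cols]
```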
2. The method for automatically detecting the foreign matter appearance of power equipment as claimed in claim 1, characterized in that: in the process of acquiring detection images in the transformer substation, the selected actual scene pictures that do not contain a detection target must not contain any target category.
3. The method for automatically detecting the foreign matter appearance of power equipment as claimed in claim 1, characterized in that: the rotation angle, exposure, saturation, and hue of the picture are adjusted to increase the size of the training data set, thereby improving the detection effect and the generalization capability of the model.
4. The method for automatically detecting the foreign matter appearance of power equipment as claimed in claim 1, characterized in that: the picture is rotated clockwise at 45-degree intervals and the pixel values of the three RGB channels are adjusted to increase the size of the training data set, thereby improving the detection effect and the generalization capability of the model;
or, in the forward propagation operation performed by the deep neural network, only the confidence loss is calculated for actual scene pictures that contain no detection target and carry no label information, which serve as negative samples.
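The augmentation in claims 3 and 4 (clockwise rotation at 45-degree intervals plus adjustment of the RGB pixel values) can be sketched as follows; the rotation convention, the zero fill for corners, and the jitter range are illustrative assumptions:

```python
import numpy as np

def rotate(img, deg):
    """Rotate an H x W (or H x W x C) image clockwise by `deg` degrees
    about its centre, nearest-neighbour, with zero-filled corners."""
    theta = np.deg2rad(deg)
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find its source pixel.
    sy = np.cos(theta) * (ys - cy) - np.sin(theta) * (xs - cx) + cy
    sx = np.sin(theta) * (ys - cy) + np.cos(theta) * (xs - cx) + cx
    sy, sx = np.rint(sy).astype(int), np.rint(sx).astype(int)
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(img)
    out[valid] = img[sy[valid], sx[valid]]
    return out

def jitter_rgb(img, scale):
    """Scale each RGB channel by the given per-channel factor."""
    return np.clip(img * np.asarray(scale), 0, 255).astype(img.dtype)

def augment(img):
    """Yield the picture rotated clockwise at 45-degree intervals,
    each copy with a mild random RGB adjustment."""
    rng = np.random.default_rng(0)
    for deg in range(0, 360, 45):
        yield jitter_rgb(rotate(img, deg), rng.uniform(0.9, 1.1, 3))
```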
5. The method for automatically detecting the foreign matter appearance of power equipment as claimed in claim 1, characterized in that: in the process of forward propagation of a negative-sample training-set picture containing no detection target through the deep neural network, the picture with the set number of channels is stretched into column vectors according to the size of the convolution kernel and input into the first convolution layer of the deep neural network for the convolution operation; the output matrix then sequentially undergoes convolution, pooling, or fully connected operations according to the structure of the deep neural network, and before each convolution operation the input matrix is stretched into column vectors according to the size of the convolution kernel;
and the training data set pictures are input into the neural network structure for forward propagation to obtain the probability of each category for each grid cell, the coordinate information of each bounding box, and the confidence that an object is contained.
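The column-stretching operation of claim 5 is commonly implemented as im2col: each receptive field of kernel size k is flattened into one column, so the convolution reduces to a single matrix multiplication. A minimal sketch, assuming a C x H x W layout, unit stride, and no padding (function names are assumptions):

```python
import numpy as np

def im2col(x, k, stride=1):
    """Stretch a C x H x W input into columns: each output column is one
    k x k receptive field flattened across all channels."""
    c, h, w = x.shape
    oh, ow = (h - k) // stride + 1, (w - k) // stride + 1
    cols = np.empty((c * k * k, oh * ow))
    idx = 0
    for i in range(0, h - k + 1, stride):
        for j in range(0, w - k + 1, stride):
            cols[:, idx] = x[:, i:i + k, j:j + k].ravel()
            idx += 1
    return cols

def conv2d(x, weights, bias):
    """Convolution as a matrix product: weights (out_ch, in_ch, k, k)
    reshaped to a matrix, multiplied by the im2col matrix, plus bias."""
    oc, ic, k, _ = weights.shape
    cols = im2col(x, k)
    out = weights.reshape(oc, -1) @ cols + bias[:, None]
    oh = x.shape[1] - k + 1
    ow = x.shape[2] - k + 1
    return out.reshape(oc, oh, ow)
```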
6. The method for automatically detecting the foreign matter appearance of power equipment as claimed in claim 1, characterized in that: the confidence loss of a bounding box containing no object is given a small loss weight; the loss weights for the confidence loss and the category loss of a bounding box containing an object normally take the value 1.
7. The method for automatically detecting the foreign matter appearance of power equipment as claimed in claim 1, characterized in that: the specific process of updating the weights of each layer by back propagation comprises: calculating the sensitivity map, then calculating the gradient required for updating the bias and the gradient required for updating the weights, and finally updating the bias and the weights by gradient descent according to these gradients.
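The update of claim 7 can be sketched for a single convolution layer: the weight gradient is the product of the layer's sensitivity map with the transposed im2col input matrix, the bias gradient is the column sum of the sensitivity map, and both parameters are then updated by gradient descent. Shapes and names below are illustrative assumptions:

```python
import numpy as np

def conv_layer_grads(x_cols, delta):
    """Gradients for one convolution layer from its sensitivity map.

    x_cols: im2col matrix of the layer input, shape (k*k*C, positions)
    delta : sensitivity map, shape (out_channels, positions)
    """
    grad_w = delta @ x_cols.T          # weight gradient
    grad_b = delta.sum(axis=1)         # bias gradient
    return grad_w, grad_b

def sgd_step(w, b, grad_w, grad_b, lr=0.01):
    """Gradient-descent update of the weights and the bias."""
    return w - lr * grad_w, b - lr * grad_b
```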
8. An automatic detection system for the foreign matter appearance of power equipment, characterized by comprising:
the image acquisition module is configured to acquire a detection image in the transformer substation and compress the picture to a specified pixel value;
the adjusting module is configured to adjust the rotation angle and the chromaticity parameters of the picture;
a loss calculation module configured to use part of the adjusted images as a training data set and perform a forward propagation operation on the training data set with a deep neural network to obtain the loss value of each layer and the total loss value, wherein the total loss value comprises the confidence loss of whether a bounding box contains an object, the category loss of each bounding box, and the position loss of each bounding box;
and a detection module configured to perform a back propagation operation on the weights of each layer according to the loss to obtain a trained deep learning model, and to use the trained deep learning model to detect transformer substation foreign matter such as bird nests, kites, plastic bags, and/or ropes in the image.
9. A computer-readable storage medium, characterized in that: a plurality of instructions are stored therein, the instructions being adapted to be loaded by a processor of a terminal device and to execute the method for automatically detecting the foreign matter appearance of power equipment according to any one of claims 1-7.
10. A terminal device, characterized by comprising a processor and a computer-readable storage medium, the processor being configured to implement instructions, and the computer-readable storage medium being used for storing a plurality of instructions, the instructions being adapted to be loaded by the processor and to execute the method for automatically detecting the foreign matter appearance of power equipment according to any one of claims 1-7.
CN201911253596.5A 2019-12-09 2019-12-09 Method and system for automatically detecting foreign matter appearance of power equipment Pending CN110826647A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911253596.5A CN110826647A (en) 2019-12-09 2019-12-09 Method and system for automatically detecting foreign matter appearance of power equipment

Publications (1)

Publication Number Publication Date
CN110826647A true CN110826647A (en) 2020-02-21

Family

ID=69544227

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911253596.5A Pending CN110826647A (en) 2019-12-09 2019-12-09 Method and system for automatically detecting foreign matter appearance of power equipment

Country Status (1)

Country Link
CN (1) CN110826647A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792578A (en) * 2021-07-30 2021-12-14 北京智芯微电子科技有限公司 Method, device and system for detecting abnormity of transformer substation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520273A (en) * 2018-03-26 2018-09-11 天津大学 A kind of quick detection recognition method of dense small item based on target detection
CN109978036A (en) * 2019-03-11 2019-07-05 华瑞新智科技(北京)有限公司 Target detection deep learning model training method and object detection method
CN110020598A (en) * 2019-02-28 2019-07-16 中电海康集团有限公司 A kind of method and device based on foreign matter on deep learning detection electric pole


Similar Documents

Publication Publication Date Title
Ling et al. An accurate and real-time method of self-blast glass insulator location based on faster R-CNN and U-net with aerial images
CN107742093B (en) Real-time detection method, server and system for infrared image power equipment components
CN110378222B (en) Method and device for detecting vibration damper target and identifying defect of power transmission line
CN108664981B (en) Salient image extraction method and device
CN104978580B (en) A kind of insulator recognition methods for unmanned plane inspection transmission line of electricity
CN107273832B (en) License plate recognition method and system based on integral channel characteristics and convolutional neural network
CN111640125A (en) Mask R-CNN-based aerial photograph building detection and segmentation method and device
CN113240039B (en) Small sample target detection method and system based on spatial position feature re-weighting
CN109902576B (en) Training method and application of head and shoulder image classifier
CN111597920A (en) Full convolution single-stage human body example segmentation method in natural scene
CN107948586A (en) Trans-regional moving target detecting method and device based on video-splicing
CN109919246A (en) Pedestrian's recognition methods again based on self-adaptive features cluster and multiple risks fusion
CN114092487A (en) Target fruit instance segmentation method and system
CN114565842A (en) Unmanned aerial vehicle real-time target detection method and system based on Nvidia Jetson embedded hardware
CN111582091A (en) Pedestrian identification method based on multi-branch convolutional neural network
CN112597919A (en) Real-time medicine box detection method based on YOLOv3 pruning network and embedded development board
CN116597224A (en) Potato defect detection method based on improved YOLO V8 network model
CN112308825A (en) SqueezeNet-based crop leaf disease identification method
CN109615610B (en) Medical band-aid flaw detection method based on YOLO v2-tiny
CN111199255A (en) Small target detection network model and detection method based on dark net53 network
CN113223037B (en) Unsupervised semantic segmentation method and unsupervised semantic segmentation system for large-scale data
CN110618129A (en) Automatic power grid wire clamp detection and defect identification method and device
CN112711985B (en) Fruit identification method and device based on improved SOLO network and robot
CN110826647A (en) Method and system for automatically detecting foreign matter appearance of power equipment
CN112837281A (en) Pin defect identification method, device and equipment based on cascade convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200221